Preface


New concepts such as network-centric operations and distributed and
decentralised command and control have been suggested as
technologically enabled replacements for platform-centric operations
and for centralised command and control in military operations. But as
attractive as these innovations may seem, they must be tested before
adoption. This report assesses the effects of collaboration across
alternative information network structures in carrying out a
time-critical task, identifies the benefits and costs of local
collaboration, and looks at how ‘information overload’ affects a system.

A  joint  US/UK  study  team  conducted  the  research  described  in 
this  report.  In  the  United  States,  the  research  was  carried  out  within 
RAND  Europe  and  the  International  Security  and  Defense  Policy 
Center  of  the  RAND  National  Security  Research  Division,  which 
conducts  research  for  the  US  Department  of  Defense,  allied  foreign 
governments,  the  intelligence  community,  and  foundations.  In  the 
United  Kingdom,  the  Defence  Science  and  Technology  Laboratory 
(Dstl)  directed  the  work  and  participated  in  the  research  effort.  Dstl 
is  the  centre  of  scientific  excellence  for  the  Ministry  of  Defence,  with 
a  mission  to  ensure  that  the  UK  armed  forces  and  government  are 
supported  with  in-house  scientific  advice.  RAND  has  been  granted  a 
licence  from  the  Controller  of  Her  Britannic  Majesty’s  Stationery 
Office  to  publish  the  Crown  Copyright  material  included  in  this 
report. 

This report will be of interest to military planners, operators,
and personnel charged with assessing the effects of alternative
information network structures, processing facilities, and dissemination
procedures. Planners contemplating the use of network-centric
processes to achieve military objectives can use the methods described
in the report to evaluate alternative structures and processes.
Information technologists can assess the contribution of each
alternative to the decisionmaker’s knowledge prior to taking a decision.
The ultimate goal is to develop tools that will allow operators to
quickly evaluate plans for their level of situational awareness.

For  more  information  on  the  RAND  International  Security  and 
Defense  Policy  Center,  contact  the  director,  James  Dobbins.  He  can 
be reached by email at James_Dobbins@rand.org; by phone at
310-393-0411, extension 5134; or by mail at RAND Corporation, 1200
South  Hayes  Street,  Arlington,  VA,  22202-5050.  More  information 
about  RAND  is  available  at  www.rand.org. 

Summary


New information technologies introduced into military operations
provide the impetus to explore alternative operating procedures and
command structures. New concepts such as network-centric operations
and distributed and decentralised command and control have been
suggested as technologically enabled replacements for platform-centric
operations and for centralised command and control. As attractive as
these innovations seem, it is important that military planners
responsibly test these concepts before their adoption. To do this,
models, simulations, exercises, and experiments are necessary to
allow proper scientific analysis based on the development of both
theory and experiment.

The primary objective of this work is to propose a theoretical
method to assess the effects of information gathering and
collaboration across an information network on a group of local
decisionmaking elements (parts of, or a complete, headquarters). The
effect is measured in terms of the reduction in uncertainty about the
information elements deemed critical to the decisions to be taken.

Our approach brings together two sets of ideas, which have been
developed thus far from two rather different perspectives. The first of
these sets is the Rapid Planning Process, developed as part of a project
on command and control in operational analysis models within the
UK Ministry of Defence Corporate Research Programme. It is a
construct for representation of the decisionmaking of military
commanders working within stressful and fast-changing circumstances.
The second set of ideas comes from the work on modelling the effects
of network-centric warfare, carried out recently by the RAND
Corporation for the US Navy. We assess the effects of collaboration
across alternative information network structures in prosecuting a
time-critical task using a spreadsheet model. We quantify the benefits
and costs of local collaboration using a relationship based on
information entropy as a measure of local network knowledge. We also
examine the effects of complexity and information overload caused by
such collaboration.


Decisions  in  a  Network 

New technologies are enabling militaries to leverage information
superiority by integrating improved command and control capabilities
with weapon systems and forces through a network-centric
information environment. The result is a significant improvement in
awareness, shared awareness, and collaboration. These improvements
in turn affect the quality of the decisionmaking process and the
decision itself, which ultimately lead to actions that change the
battlespace.

In  this  report,  we  focus  on  the  quality  of  the  decisions,  or  the 
planned  outcome,  rather  than  on  whether  or  not  the  desired  effect  is 
eventually  achieved. 

We note that decisions are made based on the information available
from three sources: information that is resident at the decision
node; information from collection assets and information processing
facilities elsewhere in the network; and information from other local
decisionmakers with whom the decision nodes are connected and
with whom they share information.

Rapid  Planning  Process 

In most cases, decisionmakers must make decisions without full
understanding of the values of the critical information elements
needed to support the decisions. The decision taken depends on the
current values of the critical information elements, which are
dependent on the scenario. This dependency is modelled using the Rapid
Planning Process. The critical information elements map out the
commander’s conceptual space. In the basic formulation of the Rapid
Planning Process, a dynamic linear model is used to represent the
decisionmaker’s understanding of the values of these factors over
time. This understanding is then compared with one or more of the
fixed patterns within the commander’s conceptual space, leading to a
decision.

A  probabilistic  information  entropy  model  is  used  to  represent 
the  uncertainty  associated  with  the  critical  information  elements 
needed  for  the  decision.  Ideally,  through  the  Rapid  Planning  Process, 
additional  information  from  collection  assets  or  from  collaborating 
elements  in  the  network  serves  to  reduce  uncertainty  and  therefore 
increase  knowledge. 

Knowledge 

We are principally concerned with the information and cognitive
domains, as depicted in Figure S.1. The domains of the information
superiority reference model divide the command and control cycle
into relatively distinct segments for ease of analysis. Their description
includes the entities resident in the domain, the procedures
performed and the products produced there, and the relationships
among the domains.

Information derived from sensors or other information gathering
resides in the information domain. This information is transformed
into awareness and knowledge in the cognitive domain and
forms the basis of decisionmaking. Our metrics quantify this process
through the use of information entropy and knowledge measures.

Information sharing among nodes ideally tends to lower information
entropy (and hence increase knowledge) partly because of the
buildup of correlations among the critical information elements. That
is, information can be gained about one critical information element
(e.g., missile type) from another (e.g., missile speed). Such cross
coupling is a key aspect for consideration, and we use conditional
entropy to capture these relationships.
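
The cross-coupling idea can be illustrated with a minimal Python sketch. It is not the report's spreadsheet model: the joint distribution of missile type and speed band below is hypothetical, and the function names are ours. It shows only that observing a correlated element (speed) lowers the entropy of another (type).

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def conditional_entropy(joint):
    """H(X | Y) for a table joint[x][y] = P(X = x, Y = y)."""
    h = 0.0
    for y in range(len(joint[0])):
        p_y = sum(row[y] for row in joint)
        if p_y > 0:
            h += p_y * entropy([row[y] / p_y for row in joint])
    return h

# Hypothetical joint distribution (an assumption for illustration):
# rows = missile type (A, B), columns = speed band (slow, fast).
joint = [[0.45, 0.05],
         [0.05, 0.45]]

h_type = entropy([sum(row) for row in joint])    # before observing speed
h_type_given_speed = conditional_entropy(joint)  # after observing speed
```

With these made-up numbers, the uncertainty about missile type falls from 1 bit to roughly 0.47 bits once speed is known.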

Figure S.1

The Information Superiority Reference Model

[Figure panel: Cognitive domain]

Knowledge derived from entropy is a quantity that reflects the
degree to which the local decisionmaker understands the values of the
information elements. It is represented as a number between 0 and 1,
with the former representing ‘no understanding’ and the latter
representing ‘perfect understanding’. From this knowledge,
decisionmakers can assess whether or not they are in their ‘comfort
zone’, that is, whether the values of the key information elements
support the decision they wish to take (such as one to launch the next
attack mission).

Effects of Collaboration

Networks provide an opportunity for participating entities to share
information as part of a collaborative process.1 Here we focus on the
synergistic  effects  of  collaboration  that  improve  the  quantity  (the 
completeness  of  our  information)  and  the  quality  (its  precision  and 
accuracy)  of  the  information  needed  to  take  decisions.  We  model  the 
network  as  the  combination  of  clusters  of  entities  and  represent  each 
entity  by  a  node.  A  cluster  consisting  of  a  single  node  is  taken  to  be 
the  degenerate  case.  Each  such  cluster  consists  of  a  set  of  entities, 
which  have  full  shared  awareness.  Full  shared  awareness  means  that  all 
entities  in  the  cluster  agree  on  the  set  of  information  elements  and 
their  values  at  any  given  time. 

Estimators 

Through observations of the battlespace, sensors and other information
sources generate estimates for the information elements deemed
critical to the decision. The uncertainty associated with the
information elements is expressed in terms of probability distributions,
the means of which are estimates of the ground-truth values. Because the
mean of a probability distribution is a parameter of the distribution,
we turn to parameter estimation theory to assess the quality of the
information available to the decisionmaker and examine how the
quality of the estimates contributes to knowledge.

•  Bias: Bias in an estimate is error introduced by systematic
distortions. An unbiased estimator is one whose statistical
expectation is the true value of the estimated parameter. That is,
the expected value of the estimate of the parameter, $\hat{\mu}$, is
the true value of the parameter, $\mu$. The bias in the estimate is
therefore the degree to which this is not true.

•  Precision: The variation in estimates of the critical information
elements can occur in a purely random way. Random errors
affect the precision of the estimates reported because they
increase the variance of the distribution of the estimated
information element. In general, precision is defined to be the degree
to which estimates of the critical information element or elements
are close together.2 Bias and precision, therefore, are
independent; that is, biased estimates may or may not be precise.

1 Collaboration in this context is taken to be a process in which
operational entities actively share information while working together
towards a common goal.

Precision  and  Entropy 

The amount of information available in a probability density is
measured in terms of information entropy, denoted $H(x)$. Information
entropy is always a function of the distribution variance, and
therefore we use it as the basis for developing a knowledge function.
For example, for the bivariate normal distribution,
$H(x,y) = \log|\Sigma|$, where $\Sigma$ is the covariance matrix. From
this, we create a precision-based knowledge function as3

$$K(x,y) = 1 - \frac{|\Sigma|}{|\Sigma|_{\max}},$$

where $|\Sigma|_{\max}$ is the determinant of the covariance matrix that
produces the maximum uncertainty. Based on precision alone, $K(x,y)$
reflects the level of understanding within a cluster of decisionmakers.
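
As a rough illustration, the sketch below evaluates a precision-based knowledge score for a 2x2 covariance matrix. It assumes the knowledge function has the form K = 1 - |Σ|/|Σ|max, which is consistent with the change-in-knowledge expression for two collaborating decisionmakers; the function names and the numbers are hypothetical.

```python
def det2(cov):
    """Determinant of a 2x2 matrix given as nested lists."""
    return cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]

def precision_knowledge(cov, cov_max):
    """Assumed form K = 1 - |Sigma| / |Sigma|_max, clipped to [0, 1]."""
    k = 1.0 - det2(cov) / det2(cov_max)
    return min(1.0, max(0.0, k))

# Hypothetical values: standard deviations 2 and 3, correlation 0.5,
# bounded by maximum variances 16 and 25 with no correlation.
rho, s1, s2 = 0.5, 2.0, 3.0
cov = [[s1 ** 2, rho * s1 * s2],
       [rho * s1 * s2, s2 ** 2]]
cov_max = [[16.0, 0.0],
           [0.0, 25.0]]

k = precision_knowledge(cov, cov_max)
```

Here |Σ| = 27 and |Σ|max = 400, so the score is 0.9325; a larger determinant (more spread) pushes the score towards 0.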

For  the  simple  case  of  two  collaborating  decisionmakers  (i.e., 
two  nodes  of  the  network  forming  a  cluster)  who  share  two  pieces  of 
information  with  a  multivariate  normal  distribution,  the  change  in 
knowledge  is  given  by 


2 This is a commonly accepted definition. Ayyub and McCuen (1997, p. 191)
define precision as ‘the ability of an estimator to provide repeated
estimates that are very close together’. A similar definition can be
found in Pecht (1995).

3 Actually, the exact entropy value for the bivariate normal case is
$H(x,y) = \tfrac{1}{2}\log\left[(2\pi e)^2 |\Sigma|\right]$. However,
because we are concerned about the relative entropy, we use the simpler
version, which we refer to as ‘relative entropy’.

$$\Delta K = \frac{\rho_{1,2}^2\,\sigma_1^2\,\sigma_2^2}{\sigma_{1,\max}^2\,\sigma_{2,\max}^2},$$

where $\rho_{1,2}$ is the correlation coefficient, $\sigma_1^2, \sigma_2^2$
are the variances, and $\sigma_{1,\max}^2, \sigma_{2,\max}^2$ are the
maximum or bounding values on the variance for the two pieces of
information.
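
A short derivation sketch shows where this change in knowledge comes from. It assumes the knowledge function has the form $K = 1 - |\Sigma|/|\Sigma|_{\max}$ and that the bounding covariance matrix is uncorrelated; both are our reading, stated here as assumptions.

```latex
\begin{align*}
|\Sigma| &= \sigma_1^2\sigma_2^2\left(1-\rho_{1,2}^2\right),
\qquad
|\Sigma|_{\max} = \sigma_{1,\max}^2\,\sigma_{2,\max}^2, \\
\Delta K &= K\big|_{\rho_{1,2}} - K\big|_{\rho_{1,2}=0}
  = \left[1-\frac{\sigma_1^2\sigma_2^2\left(1-\rho_{1,2}^2\right)}
                 {\sigma_{1,\max}^2\sigma_{2,\max}^2}\right]
  - \left[1-\frac{\sigma_1^2\sigma_2^2}
                 {\sigma_{1,\max}^2\sigma_{2,\max}^2}\right]
  = \frac{\rho_{1,2}^2\,\sigma_1^2\sigma_2^2}
         {\sigma_{1,\max}^2\sigma_{2,\max}^2}.
\end{align*}
```

Under these assumptions, the gain from sharing depends only on how strongly the two pieces of information are correlated, scaled by the current variances relative to their bounds.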


Accuracy 

Accuracy is the degree to which the estimates of the critical
information elements are close to ground truth. The concept of accuracy
comprises both precision and bias. In general, if $a$ is an information
element whose value $x$ is unknown with probability distribution $f(x)$
and mean $\mu$ representing ground truth, then the bias associated with
the estimate of the mean is $b = |E(\hat{\mu}) - \mu|$, where
$\hat{\mu}$ is the estimate of the mean. Because accuracy consists of
both bias and precision, we need a metric that combines both. One such
metric is the mean square error (MSE),
$E[(\hat{\mu} - \mu)^2] = b^2 + \sigma^2$, where $\sigma^2$ is the
variance of $\hat{\mu}$. The MSE is an extremely useful metric because it
includes both accuracy in the total and precision as a component. In
estimating ground truth, the bias accounts for nonrandom errors and
the precision accounts for random errors.
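
The decomposition of the MSE into squared bias plus variance can be checked numerically. The sketch below uses a small, made-up sample of estimates and treats the sample as the whole population of estimates (an assumption made so that the identity holds exactly).

```python
def mse_decomposition(estimates, truth):
    """Return (bias, variance, mse) of a set of estimates of `truth`."""
    n = len(estimates)
    mean = sum(estimates) / n
    bias = abs(mean - truth)                                # nonrandom error
    variance = sum((e - mean) ** 2 for e in estimates) / n  # random error
    mse = sum((e - truth) ** 2 for e in estimates) / n      # total inaccuracy
    return bias, variance, mse

# Hypothetical estimates of a quantity whose ground-truth value is 10.
bias, variance, mse = mse_decomposition([10.5, 11.5, 10.0, 12.0], truth=10.0)
```

For this sample the bias is 1.0, the variance is 0.625, and the MSE is 1.625, which equals 1.0 squared plus 0.625.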

We illustrate by continuing with the bivariate normal case. We
assume that Bayesian updating is used to refine the location estimate
based on the arriving reports. Bayesian updating is not always
unbiased and can therefore introduce systematic error. In this case, the
bias is the Euclidean distance between the Bayesian estimate and the
ground-truth value:

$$b = \sqrt{(\hat{\mu}_{x,B} - \mu_x)^2 + (\hat{\mu}_{y,B} - \mu_y)^2},$$

where $(\hat{\mu}_{x,B}, \hat{\mu}_{y,B})$ is the Bayesian estimate and
$(\mu_x, \mu_y)$ is the ground-truth value. By analogy with the MSE, the
accuracy of the estimate is defined as $D(x,y) = b^2 + |\Sigma|$.

The Effects of Bias, Precision, and Accuracy on Knowledge

We now account for bias, precision, and hence accuracy in the
knowledge function by replacing the distribution variance with the
MSE, or the accuracy measure $D(x,y)$, in the knowledge function.
Therefore, for the multivariate normal case, we get a modified
knowledge function of the form:4

$$K_M(x) = 1 - \frac{b^2 + |\Sigma|}{\left(b^2 + |\Sigma|\right)_{\max}}.$$

The ‘maximum mean square error’ is a combination of the maximum
bias and the maximum variance and represents the maximum
inaccuracy. Because bias and precision are independent, the maximum
occurs when both are maximised, or
$(b^2 + |\Sigma|)_{\max} = b^2_{\max} + |\Sigma|_{\max}$.
As with the variance, a suitable upper bound for bias can be found by
searching for the largest possible measurement error the sensors or
sources might produce.
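
A minimal sketch of the MSE-based knowledge score follows. It assumes the modified function has the form 1 - (b² + |Σ|)/(b²max + |Σ|max), as implied by replacing the variance term with the accuracy measure; the function name and the numbers are illustrative only.

```python
def knowledge_mse(bias, det_cov, bias_max, det_cov_max):
    """Assumed form K_M = 1 - (b^2 + |Sigma|) / (b_max^2 + |Sigma|_max)."""
    return 1.0 - (bias ** 2 + det_cov) / (bias_max ** 2 + det_cov_max)

# Hypothetical values: |Sigma| = 27, |Sigma|_max = 400, b_max = 5.
k_unbiased = knowledge_mse(0.0, 27.0, 5.0, 400.0)
k_biased = knowledge_mse(3.0, 27.0, 5.0, 400.0)
```

A bias of 3 units lowers the score from about 0.936 to about 0.915 even though the precision is unchanged, which is the effect the modified function is meant to capture.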


Completeness 

In addition to precision and accuracy, collaboration also affects the
completeness of the critical information elements available within a
cluster. For the entire network, we assume there are a maximum of $N$
critical information elements. For a given cluster, the total number
required is $C \le N$. However, at a given time, $t$, only $n \le C$
might be available. If waiting for additional reports is not possible, a
decisionmaker would be required to take a decision without benefit of
complete information. Depending on his experience and other contextual
information, the decisionmaker may be able to infer a likely, though
less reliable, value for the missing information. For now, we assume
that if the value of an information element is missing, the value of
completeness at cluster $i$ is

$$X_i(n) = \left(\frac{n}{C}\right)^{\xi},$$

where $\xi$ is a ‘shaping’ factor. For values of $\xi < 1$, the curve is
concave downwards; for $\xi > 1$, it is concave upwards; and for
$\xi = 1$, it is a straight line. The selection of the appropriate value
depends on the consequences associated with being forced to take a
decision with incomplete information as well as the commander’s
attitude to risk.

4 The subscript M denotes knowledge derived from the MSE.
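
The role of the shaping factor can be seen in a small sketch. It assumes completeness is the ratio of available to required elements raised to the shaping factor, (n/C)^ξ, a form consistent with the ratio-of-counts description and with the worst case of C^(-ξ) at n = 1 used later in the summary. The function name is hypothetical.

```python
def completeness(n, c, xi=1.0):
    """Assumed completeness (n / C) ** xi for n of C required elements."""
    if not 0 <= n <= c:
        raise ValueError("need 0 <= n <= c")
    return (n / c) ** xi

# Half of four required elements available, under three attitudes to risk.
tolerant = completeness(2, 4, xi=0.5)  # concave downwards: gaps penalised lightly
neutral = completeness(2, 4, xi=1.0)   # straight line
averse = completeness(2, 4, xi=2.0)    # concave upwards: gaps penalised heavily
```

With half the elements missing, the score ranges from about 0.71 for ξ = 0.5 down to 0.25 for ξ = 2.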

Information  Freshness 

A final consideration when assessing uncertainty is that of freshness.
The information arriving at a decision node consists of reports
concerning one or more of the critical information elements necessary to
take a decision. Both precision and accuracy depend on the joint
probability density function that reflects the uncertainty in our
knowledge of the ground-truth fixed pattern at a decision node.
These reports are used to update the joint probability distribution of
the information elements and hence the probability of correctness of
each of the fixed patterns in the local decisionmaker’s conceptual
space.

We have selected Bayesian updating as the method for combining
reports from various sources and sensors. All things being equal,
we desire to give more weight to more recent reports, which requires
that we reevaluate all available, valid reports at the time a decision is
to be taken. A time-lapse estimate, 0 < O < 1, is used to determine the
rate of information decay so that old information is given less weight
than current information.
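
One way such freshness weighting could work is sketched below. This is an assumption-laden illustration, not the report's procedure: it fuses scalar normal reports with a flat prior, and it discounts each report's precision by `decay ** age`, where `decay` is a hypothetical stand-in for the time-lapse estimate.

```python
def fuse_reports(reports, decay, now):
    """Precision-weighted fusion of (value, variance, timestamp) reports.

    Each report's precision (1 / variance) is multiplied by
    decay ** (now - timestamp), so older reports count for less.
    Returns the fused mean and fused variance.
    """
    total_precision = 0.0
    weighted_sum = 0.0
    for value, variance, timestamp in reports:
        weight = decay ** (now - timestamp)  # 1.0 for a report made now
        precision = weight / variance
        total_precision += precision
        weighted_sum += precision * value
    return weighted_sum / total_precision, 1.0 / total_precision

# Two hypothetical range reports with equal variance, made 2 time units apart.
reports = [(100.0, 4.0, 0.0), (110.0, 4.0, 2.0)]
mean, variance = fuse_reports(reports, decay=0.5, now=2.0)
```

With a decay of 0.5 per time unit, the fused estimate is 108 rather than the unweighted 105, because the older report carries a quarter of the weight of the fresh one.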

Measuring  the  Overall  Effect  of  Cluster  Collaboration 

Finally, we combine the currency-adjusted precision and accuracy
knowledge function with completeness to arrive at a single metric to
assess the effects of collaboration across the cluster. The ideal case is
when we have full completeness, i.e., $X_i(n) = X_i(C) = 1$, and the
knowledge shared across the cluster is fully accurate, $K_M(x) = 1$.
Unfortunately, this ideal is seldom, if ever, achieved. Consequently,
we require a construct that gauges the degree to which accuracy, as
calculated here, and completeness contribute to knowledge.

In general, when $X_i(n)$ is small, the knowledge function should
also be small. One way to reflect this behaviour is to replace the MSE
in the entropy calculation with

$$\frac{b^2 + \sigma^2}{X_i(n)}.$$

This expression has the desirable property that, when $X_i(n) \to 1.0$,
the ratio is just the MSE, and when $X_i(n) \to 0$, it increases without
bound. Because $n$ is discrete, we can select $n = 1$ to be the worst
case, with $X_i(1) = C^{-\xi}$. Consequently, the upper bound on the
resultant entropy calculation is

$$\frac{b^2_{\max} + \sigma^2_{\max}}{X_i(1)} = C^{\xi}\left(b^2_{\max} + \sigma^2_{\max}\right).$$


If $C = 1$, there is no effect on the current entropy calculation or on
the maximum entropy. If we let $K_K(x)$ be the knowledge within the
cluster based on accuracy and completeness, with the maximum
variance replaced with $C^{\xi}(b^2_{\max} + \sigma^2_{\max})$, we get

$$K_K(x) = 1 - \frac{b^2 + \sigma^2}{n^{\xi}\left(b^2_{\max} + \sigma^2_{\max}\right)}$$

for the univariate normal case.5
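
The combined measure can be sketched as follows for the univariate case. The sketch assumes completeness of the form (n/C)^ξ, divides the MSE by it, and normalises by the worst-case bound C^ξ(b²max + σ²max); under those assumptions the cluster size cancels and the score reduces to 1 - MSE / (n^ξ · MSEmax). Names and numbers are illustrative.

```python
def cluster_knowledge(bias, variance, bias_max, variance_max, n, c, xi=1.0):
    """Knowledge from accuracy and completeness (univariate sketch)."""
    if not 1 <= n <= c:
        raise ValueError("need 1 <= n <= c")
    completeness = (n / c) ** xi                      # assumed X_i(n)
    inflated_mse = (bias ** 2 + variance) / completeness
    bound = c ** xi * (bias_max ** 2 + variance_max)  # worst case, n = 1
    return 1.0 - inflated_mse / bound

# Hypothetical cluster needing 4 elements: MSE = 4, maximum MSE = 16.
scores = [cluster_knowledge(1.0, 3.0, 2.0, 12.0, n, 4) for n in (1, 2, 4)]
```

With these numbers the score rises from 0.75 with one element available to 0.9375 with all four, so missing elements depress knowledge even when the available estimates are unchanged.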

Up to this point, we have captured the effects of collaboration
among decision nodes within a cluster on knowledge. The measured
effects of information sharing through collaboration are accuracy and
completeness. For the most part, these effects are dynamical, because
they vary with the quality and quantity of reports received and
processed over time. Missing from this analysis so far is an assessment
of the systemic effects of the network structure, that is, the effects
that are more static. Next, we take up such measures of network
complexity and combine them with the collaborative effects to arrive at
a single measure of network performance and its effect on
decisionmaking.

5 The K subscript in this case refers to knowledge based on both the MSE
and completeness.


Effects  of  Structural  Complexity 

All networks exhibit complexity to a greater or lesser degree. Military
command and control systems operating in a network-centric
environment also exhibit complex behaviour. The challenge is
understanding exactly what the complexity is, what its effects are, and
how to quantify these effects. We note that there are both good and bad
effects of complexity. Unfortunately, the term ‘complexity’ has a
negative connotation; therefore, we have adopted Murray
Gell-Mann’s more neutral term, ‘plecticity’.

In this context, plecticity refers to the ability of a connected set of
actors to act synergistically via the connectivity between them. This
measure is intended to take into account the fact that there may be
constraints, due to technical or procedural limitations, on how nodes
can constructively connect to other nodes; that is, a node’s
connectivity can add costs as well as benefits to the cluster. A measure
of plecticity should account for the value of the cluster’s ability to
glean information from throughout the network to fulfil its particular
functions, include a means for measuring the value of information
redundancy, and reflect a cost to network effectiveness if nodes are
overwhelmed.

For networks with inadequate clustering, as with excessive
clustering (flows 1 and 3, respectively, in Figure S.2), we would
expect low plecticity scores. The goal is to configure the information
flow over a network with established link connectivity so as to
maximise plecticity as measured in the terms discussed above and as
illustrated by flow 2 in the figure.

Figure S.2

Overall Network Plecticity

Accessing Information

The metric developed for completeness earlier is simply a ratio of
counts: available required information elements to total required
information elements. No attempt is made to assess the degree to
which we can really expect to receive the information element, i.e., the
degree to which the network allows the cluster to access information
in the network. A metric that does so is the ratio of the aggregate
expected degree of critical information access to the total number of
required information elements. Such a metric accounts for the
uncertainties associated with retrieving needed information.

We thus replace the binary accounting for information elements
with a connectivity score based on a distance function that
recognises the cost imposed by the path the information must take
through the network to arrive at the node requiring it.

For any information element, $a_i$, we are interested in the shortest
path from source node to destination node, $d_i \ge 1$, however
calculated. The restriction that the path distances are always at least
1.0 accounts for the fact that, for connectivity to exist at all, at
least one link must exist between source and destination. The case in
which no links exist implies an infinitely long path resulting in 0
connectivity. The quantity, $d_i$, represents the expense incurred by
moving information element $a_i$ from source to destination. The
associated connectivity value is calculated as

$$k_i = \frac{1}{d_i^{\omega_i}},$$

where $\omega_i > 1$ is the rate at which $k_i$ varies with changing
values of the distance function.
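
A minimal sketch of the connectivity score is given below. It assumes the score is the reciprocal of the path distance raised to a rate exponent, and it takes distance to be the hop count found by breadth-first search; the text allows any distance function, so hop count, the graph, and the function names are all assumptions for illustration.

```python
from collections import deque

def shortest_path(graph, source, target):
    """Breadth-first search; returns a shortest node path, or None."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None

def connectivity(graph, source, target, omega=1.0):
    """Assumed score 1 / d ** omega; 0 when no path exists."""
    if source == target:
        raise ValueError("source and target must differ")
    path = shortest_path(graph, source, target)
    if path is None:
        return 0.0  # no links: infinitely long path, zero connectivity
    return 1.0 / (len(path) - 1) ** omega

# Hypothetical network: a sensor reaches the headquarters through one relay.
graph = {"sensor": ["relay"], "relay": ["sensor", "hq"],
         "hq": ["relay"], "isolated": []}
k_hq = connectivity(graph, "sensor", "hq", omega=2.0)
k_isolated = connectivity(graph, "sensor", "isolated")
```

A two-hop path with a rate exponent of 2 scores 0.25, while an unreachable node scores 0.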

The strength of the connectivity among all the nodes in such a
path must take into account the vulnerability of path elements (links
and nodes) to attack or failure. We can do this using the connectivity
score described above by examining its value as we remove each
node, link, or both, one at a time, from a given path. For simplicity,
we consider only the loss of nodes. We create a depletion vector,
$L_i$, whose elements consist of the connectivity values for information
element $a_i$, with each of the path nodes removed in turn. The vector
$L_i$ then represents the vulnerability of the path and, as such,
expresses the degree of uncertainty associated with retrieving
information element $a_i$ from network sources. The adjusted
connectivity for information element $a_i$ from network sources to a
single destination is calculated to be


ki  —  kt 


1-i 


where $|L_i|$ is the cardinality of the vector $L_i$ and p is the edge
expansion parameter of the network, which measures the overall
robustness and reliability of the network. The resulting formula for
accessibility, $X(k)$, is


x(k)  = 


C*  0 

y 

1  otherwise 


where $k = \sum_i k_i$ and $C$ is, as before, the total number of
information elements critical to the cluster.
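
The depletion vector described earlier in this section can be sketched as follows. The sketch assumes hop-count distance and a connectivity score of 1 / d ** omega, removes each intermediate node of the shortest path in turn, and records the connectivity that remains. It stops at the depletion vector and does not attempt the adjusted connectivity or the accessibility formula. The network and the names are hypothetical.

```python
from collections import deque

def shortest_path(graph, source, target, removed=frozenset()):
    """Breadth-first shortest path that avoids the `removed` nodes."""
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in previous and neighbour not in removed:
                previous[neighbour] = node
                queue.append(neighbour)
    return None

def connectivity(path, omega=1.0):
    """Assumed score 1 / d ** omega for a path with source != target."""
    return 0.0 if path is None else 1.0 / (len(path) - 1) ** omega

def depletion_vector(graph, source, target, omega=1.0):
    """Connectivity left after removing each intermediate path node in turn."""
    base = shortest_path(graph, source, target)
    if base is None:
        return []
    return [connectivity(shortest_path(graph, source, target, {node}), omega)
            for node in base[1:-1]]

# Hypothetical network with a short route via r1 and a longer one via r2, r3.
graph = {"sensor": ["r1", "r2"], "r1": ["sensor", "hq"],
         "r2": ["sensor", "r3"], "r3": ["r2", "hq"], "hq": ["r1", "r3"]}
chain = {"s": ["m"], "m": ["s", "t"], "t": ["m"]}

robust = depletion_vector(graph, "sensor", "hq")
fragile = depletion_vector(chain, "s", "t")
```

Losing r1 forces the three-hop detour, so the first network retains a connectivity of one third; the simple chain has no alternative route and drops to zero.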

Benefits of Network Redundancy

Network redundancy focuses on the reliability of the network: its
ability to deliver information in the face of node loss, system outages,
inefficient operating procedures, or some combination of these
elements. At the same time, a network can deliver excessive
information, thus causing delays because of the time and resources
required to process all of it. Consequently, network redundancy can be
both a cost and a benefit of the network information flow.

Needed information can be provided to a cluster from multiple
sources. If the value of the information will change over time, we can
expect multiple reports from each source. These multiple reports
require combining in some way, as previously discussed under
collaboration. Whatever method is used, to the degree that the reports
contribute to estimates close to ground truth and to a narrowing of
the distribution variance, a benefit will accrue to the cluster because
of redundancy. Recall that the total number of required information
elements across the whole network is $N$; the number critical to a
cluster is $C$, where $C \le N$; and the number of these available within
the cluster is $n$, where $n \le C$. If we let the vector
$\Theta = [\theta_1, \theta_2, \ldots, \theta_C]^T$
represent the aggregate value of reports received for each required
information element $(a_1, a_2, \ldots, a_C)$ from
$P = [p_1, p_2, \ldots, p_C]^T$ sources,
then we can construct a suitable normalised aggregate metric,
$R(\Theta)$, as


$$R(\Theta) = 1 - \frac{1}{C}\sum_{i=1}^{C} \gamma_i\, e^{-\delta_i(\theta_i - 1)},$$

where $\gamma_i = 1$ if $p_i > 1$ and 0 otherwise. We let $r_i(\theta_i)$
be the benefit accruing from obtaining reports on the value of
information element $a_i$ from $p_i$ sources, where
$\theta_i = \sum_{j=1}^{p_i} \theta_{i,j}$, and
$\theta_{i,j} \in [1, \infty)$ measures the assessed reliability of the
report on information element $a_i$ from source $S_j$. The parameter,
$\delta_i$, reflects the relative importance of the information element,
$a_i$.

The combined benefit of information redundancy to the cluster, based
on the conditional dependency between accessibility and redundancy,
is
(P-1)[kX'(^)  +  P/?(0)]
■  (p-K)[p-*(*)] 


where $\beta > 1$ is a constant that ensures a nonzero denominator and
$\kappa > 0$ is another constant that ensures that the combined metric
is bounded between 0 and 1.


Costs  of  Information  Overload 

At  the  same  time,  a  network  can  deliver  excessive  information.  The 
more  sources  of  required  information  and  the  more  frequent  the 
reporting,  the  longer  it  takes  for  the  cluster  to  get  a  coherent  view  of 
the  situation.  That  is,  it  takes  time  to  process  information,  which  may 
or  may  not  contribute  to  improving  the  quality  of  the  estimates.  This 
excess  is  referred  to  as  ‘information  overload’.  In  addition,  some  of 
the  sources  may  provide  disconfirming  evidence.  The  value  of  the 
disconfirming  evidence  can  be  good  or  bad,  depending  on  the  degree 
to  which  it  reflects  ground  truth.  Disconfirming  evidence  requires 
time  to  evaluate  and  therefore  may  increase  uncertainty  and  decrease 
the  quality  of  the  estimates.  Finally,  it  is  also  possible  that  raw  data 
may  be  processed  before  being  sent,  thus  arriving  at  the  cluster  as 
time-stamped  information  with  the  time  at  which  the  processing 
ended.  This  possibility  introduces  an  artificial  latency  that  contributes 
to  uncertainty. 

The supply of unneeded information to a cluster has an immediate
negative impact, because it must be processed or, at a minimum,
interferes with the receipt of needed information. However, as
more unneeded information is supplied, the marginal impact of each
additional source diminishes. Thus, a good function to model this
behaviour is the exponential $U(m) = 1 - e^{-\nu m}$, where $m$ is the
number of sources of unneeded information and $\nu$ is a scaling parameter.
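
The diminishing marginal impact can be checked numerically. The sketch below assumes the exponential form $1 - e^{-\nu m}$; the function name and the value of the scaling parameter are hypothetical.

```python
import math


def unneeded_cost(m: int, nu: float = 0.5) -> float:
    """Cost of m sources of unneeded information: 1 - exp(-nu * m)."""
    if m < 0 or nu <= 0.0:
        raise ValueError("require m >= 0 and nu > 0")
    return 1.0 - math.exp(-nu * m)


if __name__ == "__main__":
    # Each additional source adds less cost than the one before it.
    for m in range(1, 6):
        step = unneeded_cost(m) - unneeded_cost(m - 1)
        print(m, round(unneeded_cost(m), 4), round(step, 4))
```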

The costs of information overload associated with needed
information within a cluster are generally minimal for low levels of
redundancy. Indeed, at these levels, the benefits far outweigh the
costs, as discussed earlier. However, at some point, costs rise sharply
so that the marginal cost of an additional source of information is
greater than that of the previous source. At some further point, this
cost then levels off so that the marginal costs are minimal. This
behaviour is best described using a logistic response function for each
information element shared within the cluster. For simplicity, we
express the combined costs of oversupply of needed information as a
simple sum,

$$G(P) = \sum_{i=1}^{C} \frac{e^{\chi_i + \varphi_i \rho_i}}{1 + e^{\chi_i + \varphi_i \rho_i}},$$

where $\chi_i$ and $\varphi_i$ are shaping parameters.
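
A minimal sketch of this cost follows, assuming one logistic term per information element with shaping parameters chi and phi and a source count rho. All names and numbers are hypothetical. Note that an unnormalised sum over several elements can exceed 1.

```python
import math


def logistic_cost(rho: float, chi: float, phi: float) -> float:
    """Logistic response e^z / (1 + e^z) with z = chi + phi * rho."""
    z = chi + phi * rho
    return 1.0 / (1.0 + math.exp(-z))  # algebraically equal to e^z / (1 + e^z)


def needed_cost(rhos, chis, phis) -> float:
    """Simple sum of the logistic costs over all information elements."""
    if not (len(rhos) == len(chis) == len(phis)):
        raise ValueError("one rho, chi and phi is required per information element")
    return sum(logistic_cost(r, c, p) for r, c, p in zip(rhos, chis, phis))


if __name__ == "__main__":
    # With chi = -5 and phi = 1 the cost is low for few sources and climbs sharply near rho = 5.
    for rho in (1, 3, 5, 7, 9):
        print(rho, round(logistic_cost(rho, -5.0, 1.0), 4))
```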

In considering the overall costs for the cluster, a balance is struck
between costs of needed and unneeded information. We use a simple
weighted linear sum of the two components of information overload,
or $O[U(m), G(P)] = \alpha U(m) + (1 - \alpha) G(P)$, where $0 < \alpha < 1$ is a
relative weight parameter.

Redundancy-Based  Plecticity 

The next step is to combine the costs and benefits of plecticity for a
cluster associated with the mission at hand. For each cluster in the
network, the measure of network plecticity, $C(B, O)$, is calculated as
follows:

$$C(B, O) = B[R(\Theta) \mid X(K)] \, [1 - O[U(m), G(P)]].$$
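
The two combination steps can be sketched together. We assume that the overload cost is a weighted linear sum of the unneeded and needed components and that plecticity is the benefit discounted by that cost. Inputs are taken to be already normalised to [0, 1]; the function names and sample values are hypothetical.

```python
def overload_cost(u: float, g: float, alpha: float = 0.5) -> float:
    """Weighted linear sum alpha * u + (1 - alpha) * g of the two overload components."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * u + (1.0 - alpha) * g


def plecticity(benefit: float, overload: float) -> float:
    """Redundancy-based plecticity: benefit * (1 - overload)."""
    if not (0.0 <= benefit <= 1.0 and 0.0 <= overload <= 1.0):
        raise ValueError("benefit and overload must lie in [0, 1]")
    return benefit * (1.0 - overload)


if __name__ == "__main__":
    cost = overload_cost(u=0.4, g=0.8, alpha=0.25)
    print(cost, plecticity(benefit=0.8, overload=cost))
```

A cluster with high benefit can still score poorly if its overload cost approaches 1, which is the penalty the multiplicative form is meant to capture.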


Network  Performance 

The last step is to combine the redundancy-based plecticity with the
benefits of collaboration across all the clusters of the network. Our
collaboration metric quantifies the effects of information sharing
across a cluster on information completeness and accuracy, whereas
plecticity measures the positive and negative effects of redundant
information and the degree of information access. The former assesses
the dynamic nature of the operation conducted on the network; the
latter measures the effects of the underlying network structure and is
therefore systemic. The dependencies among the several components
of collaboration and plecticity are not generally well understood.
However, we know that high-quality performance requires good
cluster knowledge and the means to share it, and that scores in
either category are penalised by deficiencies in the other. Therefore,
the measure of total network performance is taken to be

$$Q(\Pi, K, N) = \prod_{i=1}^{L} \left[ C_i(B, O) \, \Pi_{i,K} \right]^{\omega_i},$$

where $\sum_{i=1}^{L} \omega_i = 1$ and $L$ is the number of clusters.
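
One way to read this measure is as a weighted geometric mean over clusters. The sketch below assumes that each cluster contributes the product of its plecticity score and a collaboration score, both in [0, 1], raised to a weight, and that the weights sum to 1. The function name and inputs are hypothetical.

```python
import math


def network_performance(plecticities, collaborations, weights) -> float:
    """Product over clusters of (plecticity * collaboration) ** weight."""
    if not (len(plecticities) == len(collaborations) == len(weights)):
        raise ValueError("one plecticity, collaboration and weight is required per cluster")
    if not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must sum to 1")
    return math.prod(
        (c * k) ** w for c, k, w in zip(plecticities, collaborations, weights)
    )


if __name__ == "__main__":
    # Two equally weighted clusters; the weaker cluster pulls the total down.
    print(network_performance([0.64, 1.0], [1.0, 1.0], [0.5, 0.5]))
```

Because the terms multiply, a cluster that scores near zero on either component drives the total towards zero, whatever the other clusters achieve.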

For values of $Q$ close to 1.0, the network is performing
well by producing the information required to take decisions within
each of the clusters when required. However, this is not the whole
story. The next step is to assess how well the combat mission is
accomplished. As important as good decisions are, good combat
outcomes are the ultimate measure of the value of network-centric
operations. An example application shows how these approaches can be
combined. The mathematical approach is used to filter out preferred
network and clustering assumptions, which are then tested in a
simulation environment. This allows the development of both
network-based Measures of Command and Control Effectiveness and
higher-level Measures of Force Effectiveness.
AA Bde: Air Assault Brigade
Accuracy: The degree to which information agrees with ground truth
ACP: Ammunition Control Point
AH Regt: Attack Helicopter Regiment
Armd Bde: Armoured Brigade
Armd Div: Armoured Division
Awareness: A realisation of the current situation
Bias: Error in an estimate introduced by systematic distortions
BSA: Brigade Supply Area
C4ISR: command, control, communications, computers, intelligence, surveillance, and reconnaissance
CEC: Cooperative Engagement Capability; a capability that combines data from all platforms in an operation and allows the combined data to produce a better shared CROP
CEP: circular error probable
Cluster: A set of network nodes possessing full shared awareness
CMM: Collaboration Metric Model
CoA: course of action
Collaboration: A process in which operational entities actively share information while working together towards a common goal
Complexity: The condition of having several interrelated parts in a network with several interrelated operational entities. Kolmogorov definition: The length of the shortest binary program needed to compute a string of data; the minimal description length
Conceptual space: The conceptual space of a commander is the space defined by the values of his critical information requirements
CROP: common relevant operating picture; a view of the battlespace shared by all friendly forces
DLM: dynamic linear model
DSA: Divisional Supply Area
Dstl: Defence Science and Technology Laboratory
FOB: Forward Operating Base
FSG: Forward Support Group
Full shared awareness: A set of network nodes that (1) share information, (2) agree on the same set of critical information elements, and (3) agree on the current values of the agreed critical information elements
Information entropy: A measure of the average amount of information in a probability distribution (also referred to as Shannon entropy)
Information superiority: The ability to collect, process, and disseminate information as needed; anticipate changes in the enemy’s information needs; and deny the enemy the ability to do the same
IPB: intelligence preparation of the battlefield
Knowledge: Accumulated and processed information wherein conclusions are drawn from patterns
Logically connected nodes: Nodes with a communication path between them
MADM: multiple attribute decisionmaking
Measures: Standards for comparison
Mech Bde: Mechanised Brigade
Metrics: Mathematical expressions that evaluate both the relative effect of alternatives and the degree to which one is better or worse than another
MLRS Regt: Multiple-Launch Rocket System Regiment
MSE: mean square error; a measure of the accuracy of an estimate. It is the sum of the bias and the precision of the estimate
Mutual information: The amount of information gained about random variable X based on information gained about dependent variable Y
NAI: named area of interest
PCPR: perceived combat power ratio
Physically connected nodes: Nodes with a communications link between them
Plecticity: The ability of a connected set of actors to operate synergistically via the connectivity among them
Precedence weighting: A multi-attribute decisionmaking method
Precision: The degree to which multiple observations are close together
RPD: Recognition Primed Decision
SA: situation awareness
SAW: simple additive weights; a multi-attribute decisionmaking method
Shared awareness: The ability of a decisionmaking team to share realisations
TAI: target area of interest

CHAPTER  ONE 


Introduction 


New information technologies introduced into military operations
provide the impetus to explore alternative operating procedures and
command structures. New concepts such as network-centric operations
and distributed and decentralised command and control have been
suggested as technologically enabled replacements for platform-centric
operations and centralised command and control. As attractive as
these innovations may seem, it is important that military planners
responsibly test these concepts before their adoption. To do this,
models, simulations, exercises, and experiments are necessary.


Objective 

The major objective of this work is to produce a method to assess the
effects of information gathering and sharing across an information
network on the quality of decisions taken by a group of local
decisionmaking elements (parts of, or a complete, headquarters). The
effect is measured in terms of the reduction in uncertainty about the
information elements deemed critical to the decisions to be taken at
these local decisionmaking elements. We are thus assuming that the
set of information elements necessary to produce a local conceptual
picture of the battlespace is known.1 The issue here is the degree of
confidence with which they are known, as measured by the local
decisionmaking element’s level of knowledge.

1 Other experimentally based research work in the United Kingdom is considering what
these factors are in different scenarios.

The  term  ‘knowledge’  has  several  meanings,  and  therefore  it  is 
important  that,  at  the  outset,  we  define  what  it  means  in  the  context 
of  the  decisionmaking  processes  described  in  this  work.  Formally,  we 
define  knowledge  to  be  accumulated  and  processed  information 
wherein  conclusions  are  drawn  from  patterns.  Information  elements 
accumulated  over  time  form  patterns  that  can  be  matched  to  known 
patterns. The more reports confirming a given pattern, the less
uncertainty remains and the more knowledge is gained.


The  Information  Superiority  Reference  Model 

In terms of the categorisation developed by Alberts et al. (2001), we
are representing the flow of information about the physical domain
around the network in the information domain and its effect (in terms
of knowledge, situation assessment, shared awareness, and
decisionmaking) in the cognitive domain. These concepts are embodied
in the information superiority reference model depicted in Figure 1.1.
Information superiority is a term used to express the ability of one
side in a conflict to impose its will over the other based on superior
information collection, processing, and dissemination capabilities.
Formally, we define information superiority to be the ability to
collect, process, and disseminate information as needed; anticipate
changes in the enemy’s information needs; and deny the enemy the
ability to do the same.

Both  sides  in  a  conflict  generally  have  different  perceptions  of  a 
single  reality,  referred  to  as  the  situation.  Figure  1.1  shows  how  the 
three domains contribute to this perception; each box lists the major
activities performed in the corresponding domain. The
physical  domain  is  where  reality,  or  ground  truth,  resides.  In  addition 
to  physical  objects,  such  as  weapon  systems,  terrain  features,  and 
sensors,  this  domain  also  contains  intangibles,  such  as  enemy  intent, 
plans,  and  current  and  projected  activities.  A  complete  assessment  of 
the  situation  will  contain  estimates  about  each. 
Figure 1.1
The Information Superiority Reference Model
[Figure not reproduced; surviving panel label: Cognitive domain]



In the information domain, data are extracted from the physical
domain and processed to form structured information in the form of
a common relevant operating picture (CROP). Three primary functions
are performed in the information domain: collecting data through the
use of sensors and sources, including tasking sensors to close gaps
in the data; processing the data through the fusion process to
produce the CROP; and disseminating relevant parts of the CROP
to friendly units. The last step contributes to the collaboration
process in the cognitive domain in which the shared CROP is
transformed into a shared awareness of the current and future situation
that can be used to gain understanding of threats and opportunities as
well as the subsequent decisionmaking regarding an appropriate
course of action. Our quantified assessment of the difference due to
local collaboration is a knowledge-based metric and hence resides in
the cognitive domain.2

Finally, the human activities associated with using the information
available to form an estimate of the situation are accomplished in
the cognitive domain. To the extent that decisionmaking teams exist,
they collaborate to form a level of situational awareness. In addition
to the CROP produced in the information domain, individual team
members and the decisionmaker may have prior information from
processes like the intelligence preparation of the battlefield (IPB)
available to support their deliberations. The decisionmaker is also
likely to have concerns and expectations about the performance of his
own forces, as well as the enemy’s, that would colour his assessment
of the situation and therefore his decision. These elements are
depicted in Figure 1.1 as emanating directly from the physical domain.

This  report  documents  the  mathematical  constructs  and  metrics 
used  to  assess  the  effectiveness  of  the  various  operating  schemes  and 
command  arrangements. 


Research  Approach 

The basis of our approach is to bring together two sets of ideas, which
have been developed thus far from rather different perspectives. The
first of these comes from the work performed as part of a project on
command and control in operational analysis models within the UK
Ministry of Defence Corporate Research Programme. The programme
aims to provide the Ministry of Defence with the ability to
carry out fundamental research not tied to particular procurement
programmes. In this case, it has led to the development of the Rapid
Planning Process (Moffat, 2002) as a construct for representation of
the decisionmaking of military commanders working within stressful
and fast-changing circumstances. The process is now well accepted
and has been included in a number of key command and
control-centred simulation models developed or under development
by the UK Defence Science and Technology Laboratory (Dstl). Such
a representation approximates to the ‘simple decisionmaking’ of
Alberts et al. (2001) in which the information elements and the
criteria for decision are known and a satisficing strategy is adopted.

2 Collaboration in this context is taken to be a process in which operational entities actively
share information while working together towards a common goal.

The second set of ideas comes from the work on modelling the
effects of network-centric warfare, carried out recently by the RAND
Corporation for the US Navy (Perry et al., 2002). In this work, the
effects of collaboration across alternative information network
structures in prosecuting a time-critical task can be assessed using a
spreadsheet model. The benefits and costs of local collaboration are
quantified using a relationship based on information entropy as a
measure of local network knowledge. The effects of network complexity
and the completeness of the information collected are also reflected
in the overall assessment of the quality of the information made
available to the decisionmakers.

To merge these two ideas, we examine the decisionmaking process
among networked headquarters. We postulate that improved
decisions are contingent on increased knowledge and, therefore, on
diminished uncertainty. The pattern-matching features of the Klein
Recognition Primed Decision (RPD) model (Klein, 1989) are used to
match current estimates of critical information elements to the
decisionmaker’s set of typical situations or internalised patterns. A
match is made when the current estimates lie within the comfort zone
of one of the typical situations.

There  are  several  analytic  techniques  available  that  are  able  to 
match  estimates  of  values  of  multiple  information  elements  to  sets  of 
typical situations, variously referred to as pattern-matching
techniques or classification processes. In this work, we rely on the
matching  algorithms  within  the  Rapid  Planning  Process  mentioned 
earlier  and  discussed  in  detail  in  Appendix  A.  The  decision  to  be 
taken  in  this  case  is  the  selection  of  an  appropriate  course  of  action 
based  on  the  closeness  of  the  current  critical  information  element 
estimates  to  one  of  the  typical  situations. 

Since, in a military operation, a rapid decision is usually desirable,
the focus now centres on the means used to collect information
about the uncertain critical information elements, the ease with
which this information is shared among participants in the operation,
the quality of the resulting processed information, and its effect on
knowledge. The methodology then turns to examining the structure
of the decision networks and the quality and quantity of the
collaboration exercised on them, and how both contribute to overall
knowledge and, by extension, better decisions.


Organisation  of  This  Report 

In the next chapter, we set forth the framework for thinking about
decisionmaking in a network. In Chapter Three, we address the
uncertainties associated with information elements needed to support
decisions, and suggest statistical representations that include a
knowledge metric. Chapter Four examines the effects of collaboration
on network performance. In Chapter Five, we explore the effects of
network complexity on network performance and combine collaboration
and network complexity to achieve a single metric that measures
the performance of networked clusters of decision nodes.

In addition, we include three appendixes. Appendix A describes
the Rapid Planning Process, and Appendix B discusses information
entropy used in the development of the knowledge metric. Finally,
Appendix C describes an application of the measures and metrics
discussed in the text to the logistics command and control problem
discussed in Chapter Two. Appendix C also discusses how the Measures
of Command and Control Effectiveness, examined in the main body
of this report, may be combined with combat models to assess the
effects of increased knowledge on force effectiveness.
CHAPTER TWO


Decisions in a Network


Western militaries are formulating new visions, strategies, and
concepts that rely on acute situational awareness, the transformation of
information into knowledge, and rapid, secure means of sharing
knowledge. They seem to be placing great reliance on networked
forces that are fully integrated with joint, national, and coalition or
allied systems. To achieve these goals, militaries must create and
leverage information superiority. It is foreseen that, under some
circumstances, a mix of command and control capabilities would be
integrated with weapon systems and forces on an end-to-end basis
through a network-centric information environment to achieve
significant improvements in awareness, shared awareness, and
collaboration (Alberts et al., 2001; Alberts et al., 2002).

The ultimate effect, however, is on the quality of the decisionmaking
process and the decision itself. These decisions ultimately
lead to actions that change the battlespace. In this report, we are thus
concerned with the quality of these decisions, i.e., the planned
outcome, rather than the effect in the physical domain. It is almost an
article of faith that a richly connected network of decision nodes will
perform better by improving the quality of decisions. However, we
need to quantify this benefit as well as consider and quantify the
downside of such information sharing (such as the effect of
information overload and the problems associated with resolving
disconfirming evidence).
The Decision Model

For this work, we assume that the decisions taken by the various
decision elements in the headquarters network are selections of courses
of action (CoAs) in response to the perceived situations. The CoAs
prescribe actions to be taken in the event that the situation in the
battlespace deviates from what is expected. Ideally, a mutually
exclusive and collectively exhaustive set of CoAs is known to the
decisionmakers, and all they need do is collect information that
informs the perceived situation. In general, this is only partially
true: CoAs can also be developed in response to unfolding events,
which may not have been perceived a priori. However, it is a reasonable
assumption when representing expert decisionmakers in stressful and
time-critical circumstances.

This approach is consistent with the naturalistic decisionmaking
paradigm of the RPD model, introduced by Gary Klein (1989). Klein
argues that experienced decisionmakers store up a set of typical
situations and responses over time. They search the environment for
clues, cues, and expectancies that might clarify the situation. Once the
situation is perceived to match one of their stored situations, the
decisionmakers are then able to respond accordingly by selecting what
they feel is an appropriate course of action, generally something that
has worked in the past. However, if the situation is not clarified, they
seek additional information or examine the situation to determine
causes for the lack of clarity. This assessment could lead to the
modification of a typical situation and response or to the creation of a
whole new stored experience. The latter behaviour results in the
creation of new CoAs.

Matching  the  current  situation  to  one  of  the  decisionmaker’s 
stored  situations  is  clearly  a  subjective  process.  Each  decisionmaker 
assesses the current values of what are considered to be critical
information elements and decides whether the values are ‘close enough’ to
one  of  the  stored  situations.  The  choice  of  a  ‘good  enough’  stored 
situation  defines  what  we  refer  to  as  the  decisionmaker’s  comfort  zone. 
Figure 2.1 illustrates this process.1 In this case, the commander’s
conceptual space is described in terms of two critical information
elements, $a_1$ and $a_2$. The ground-truth values of these information
elements are not known with certainty and therefore are considered
to be random variables with known densities. The ellipses in the
diagram represent the decisionmaker’s comfort zones for each of the
stored situations. The centre of each is the desired value set, and the
major and minor axes represent acceptable deviations from this
desired set. Both the centre and the axis lengths in each direction are
fixed. The centre of the shaded ellipse represents the current estimates
for $a_1$ and $a_2$, and the axes represent the uncertainty in the estimate
based on the covariance between the two.


Figure 2.1

Decisionmaker's Conceptual Space and Stored Situations

1 See Moffat (2002) for a more complete discussion.

In the diagram, we depict four stored situations, each with its
degree of acceptable uncertainty depicted by the size of its ellipse.
The shaded ellipse is the current estimate, and its size represents the
uncertainty in the estimate. In this case, although the estimate is
closest to situation $S_1$, it does not fall completely in the comfort zone.
The issue then is to discern how close the shape must be to declare a
match. In practice, this is a subjective process dependent in part on
the decisionmaker’s attitude to risk.
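
The geometry of the comfort zone can be illustrated with a short sketch. It is not the matching algorithm of the Rapid Planning Process, which is described in Appendix A. It assumes axis-aligned elliptical comfort zones, ignores the uncertainty ellipse around the estimate, and tests only whether the point estimate lies inside the closest zone. The class name, field names and sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StoredSituation:
    name: str
    centre: tuple[float, float]     # desired values of the two information elements
    semi_axes: tuple[float, float]  # acceptable deviation along each element


def normalised_distance(situation: StoredSituation, estimate: tuple[float, float]) -> float:
    """Distance from the zone centre in units of the semi-axes; <= 1 means inside the ellipse."""
    dx = (estimate[0] - situation.centre[0]) / situation.semi_axes[0]
    dy = (estimate[1] - situation.centre[1]) / situation.semi_axes[1]
    return (dx * dx + dy * dy) ** 0.5


def closest_situation(situations, estimate):
    """Return the name of the nearest stored situation and whether the estimate lies inside it."""
    best = min(situations, key=lambda s: normalised_distance(s, estimate))
    return best.name, normalised_distance(best, estimate) <= 1.0


if __name__ == "__main__":
    stored = [
        StoredSituation("S1", (0.0, 0.0), (2.0, 1.0)),
        StoredSituation("S2", (5.0, 5.0), (1.0, 1.0)),
    ]
    print(closest_situation(stored, (1.0, 0.5)))  # inside the S1 comfort zone
    print(closest_situation(stored, (2.5, 0.0)))  # closest to S1 but outside its zone
```

A fuller treatment would also compare the size of the uncertainty ellipse with the comfort zone, which is where the decisionmaker's attitude to risk enters.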


Estimators 

Through observations of the battlespace, sensors and other
information sources generate estimates for the information elements
deemed critical to the decision. As we discuss in the next chapter, the
uncertainty associated with the information elements is expressed in
terms of probability distributions, the means of which are estimates of
the ground-truth values. The quality of the estimates is therefore of
concern to us as we assess the contribution of networking to the quality
of the decisions taken. Because the mean of a probability distribution
is a parameter of the distribution, we turn naturally to parameter
estimation theory to assess the quality of the information available to
the decisionmaker, and we examine how the quality of the estimates
contributes to knowledge. Estimation theory provides mathematical
constructs for qualities of estimates such as accuracy, bias,
precision, sufficiency, efficiency, and consistency. We discuss some of
these terms more fully in Chapter Four.


A  Networked  Decision  Model 

Figure 2.2 depicts a simple network of decisionmaking nodes that are
connected to each other to form a decision network. Alberts et al.
(2001) make the point that such a node-based network should
represent actors, decisionmakers (or knowledgeable entities), and
sensors (in the most general sense of information gatherers). We put the
focus here on information gathering and decisionmaking. Each node
thus represents either a ‘local decisionmaker’ (i.e., a local commander
who needs to make decisions) or an information source (i.e., a
collection facility such as a sensor, a processing facility such as a
fusion centre, or a source of information about future plans). Local
decisionmakers take decisions based on the information available to
them either locally, from collection assets and information processing
facilities elsewhere in the network, or from other local decisionmakers
with whom they are connected. The connectivity depicted is logical and
not necessarily physical. The structure of this network (the local
commanders represented and how they link up across a network) will
be determined by the way we choose to organise the system and
develop a plan.

Figure 2.2

Network of Decisionmaking Elements
[Figure not reproduced; surviving legend entry: Resident information source]

Information  is  thus  available  from  three  sources:  other  decision 
nodes,  external  information  sources,  and  information  resident  at  the 
decision  node.2  In  this  depiction,  we  are  concerned  solely  with  such 
information  flows. 

Clusters 

In Figure 2.2, some of the decision nodes are linked together to form a
cluster that allows for local sharing of information. The term ‘cluster’,
as used here, refers to a set of network decision nodes that (1) share
information, (2) agree on a common set of critical information
elements, and (3) agree on the current value of the agreed critical
information elements and degree of uncertainty associated with the
current values. The term ‘local’ refers to proximity in terms of logical
connectivity. It does not necessarily imply physical nearness. In
addition, we assume that each of these clusters supports distributed
decisionmaking over time. Hence, we consider the process to be
dynamical.

Clusters  of  decision  nodes  have  the  following  properties: 

•  Only  decision  nodes  can  be  members  of  a  cluster. 

•  A  cluster  forms  a  complete  graph.  All  decision  nodes  communi¬ 
cate  with  each  other,  thus  producing  n(n  —  1)  connections,  but 
these  connections  are  not  necessarily  physical. 

•  All  decision  nodes  in  a  cluster  are  self-aware.  Each  decision  node 
is  aware  of  its  own  status  and  is  able  to  inform  others  in  the  clus¬ 
ter. 

•  A  cluster  could  consist  of  a  single  decision  node,  a  number  of 
nodes,  or  perhaps  even  all  decision  nodes  in  the  network. 

•  Clusters  may  or  may  not  communicate  with  each  other. 


2  Resident  information  is  sometimes  referred  to  as  ‘organic  information’.  This  expression  is 
the  preferred  term  in  the  Office  of  Force  Transformation’s  network-centric  operations 
framework  (Office  of  Force  Transformation,  Network-Centric  Operations  Conceptual 
Framework,  Version  1.0,  12  April  2004;  available  at  www.oft.osd.mil). 
•  A decision network consists of the union of clusters. The total
network consists of the decision network and all supporting
nondecision nodes.

For example, in Figure 2.2 decision nodes 1 and 2 share information
and therefore form a cluster. Decision nodes 3, 4, and 5 also
share  information  and  therefore  form  another  cluster.  Note  that 
although  decision  nodes  2  and  3  may  share  information  with  each 
other,  neither  shares  information  with  the  other  decision  nodes  in  the 
other’s  cluster.  In  the  academic  literature,  ‘small  world  networks’  have 
taken  an  approach  similar  to  this  in  which  highly  clustered  sets  of 
nodes  are  linked  by  longer-range  ‘shortcuts’.  These  types  of  links  lead 
to  desirable  network  properties  such  as  a  high  clustering  coefficient  (a 
measure  of  how  well  the  network  is  linked  locally)  and  a  low  average 
path  length  (a  measure  of  the  mean  number  of  links  between  two 
randomly  chosen  nodes).3 
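
The two small-world measures named above can be computed directly for a small network. The following is a minimal sketch only: the link list is an assumed reading of the five-node example (clusters {1, 2} and {3, 4, 5}, joined by a single link between nodes 2 and 3), and the function names are illustrative rather than taken from the report.

```python
from collections import deque
from itertools import combinations

# Hypothetical link list: clusters {1, 2} and {3, 4, 5} are complete graphs,
# joined by one longer-range link between decision nodes 2 and 3.
EDGES = [(1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]


def build_adjacency(edges):
    """Return an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj):
    """Average local clustering coefficient (0 for nodes with < 2 neighbours)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count neighbour pairs that are themselves linked.
        linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += linked / (k * (k - 1) / 2)
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over all node pairs (connected graph assumed)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source node
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


adjacency = build_adjacency(EDGES)
coefficient = clustering_coefficient(adjacency)
path_length = average_path_length(adjacency)
```

Under these assumed links, adding further shortcuts between the two clusters would shorten the average path while leaving the local clustering within each cluster largely intact.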

Partitioning 

We mentioned earlier that our goal is to assess the degree to which
networked headquarters increase (or decrease) the knowledge available
to the decisionmakers and at what cost. We stop short of actually
taking the decision but rather measure success on the premise that
more knowledge improves decisionmaking.

One  way  to  affect  network  knowledge  may  be  to  rearrange  or 
partition  the  network  clusters.  In  Figure  2.2,  for  example,  there  are 
several  possible  partitions,  ranging  from  five  separate  independent 
clusters  of  a  single  decision  node  each  to  one  cluster  consisting  of  all 
five  decision  nodes.4  The  question  therefore  is  how  best  to  partition 
the  network  to  improve  knowledge  at  an  acceptable  cost. 
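
The number of candidate partitions grows quickly with the number of decision nodes, which is why exhaustive comparison soon becomes impractical. As a rough illustration, the sketch below counts the ways of splitting n decision nodes into nonempty clusters by summing Stirling numbers of the second kind. The function names are hypothetical, and the count assumes every grouping is operationally admissible, which need not hold in practice.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of ways to split n nodes into exactly k nonempty clusters."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    # Either the n-th node forms its own cluster, or it joins one of k clusters.
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)


def count_partitions(n):
    """Total number of partitions of n decision nodes into clusters."""
    return sum(stirling2(n, k) for k in range(1, n + 1))


counts = {n: count_partitions(n) for n in range(1, 8)}
```

For ten decision nodes the same function already returns a six-digit count, so a search over partitions would need pruning by operational constraints.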


3  See  Watts  (1999)  and  Albert  and  Barabasi  (2002). 

4 For a three-decision node network, the number of partitions is five;
for a four-decision node network, it increases to 15; and for the
five-decision node network, depicted in Figure 2.2, the number of
possible partitions is 52. The number of partitions for n nodes is

$$P_n = \sum_{k=1}^{n} S(n,k),$$
Requirements for a Model of the Process

We now take up the requirements for the general model in more detail.
There are many ways in which networks can be evaluated, using a variety
of methods such as Petri nets, Bayesian networks, or neural networks.
The approach chosen depends on the particular characteristics of the
network and the metrics that have analytic value. Following the ideas of
Claude Shannon, we use information entropy as a key construct in
developing metrics, since we wish to focus on information and how it is
converted into knowledge. We also use estimation theory to assess the
quality of the estimates of the critical information elements needed to
take decisions. In addition, we wish to capture the network dynamics of
local information sharing, clustering in the form defined above, local
collaboration, and the costs associated with complex network structures,
since these capture core aspects of potential future headquarters
structures. It is for these reasons that we have adopted the method
presented here.

Consider one of these clusters, i. Suppose a local decisionmaker within
cluster i must take a critical decision at time t. Estimates of the
information required for the cluster to render a decision are
accumulated over time so that if

$$\mathbf{x}_i(t) = [x_{i,1}(t), \ldots, x_{i,C}(t)]$$

represents the current estimated values for the C cluster-agreed
critical information elements $\{a_1, \ldots, a_C\}$ needed at time t,
the historical matrix of values for the estimates of the critical
information elements is represented by the t × C matrix:

$$[\mathbf{x}_i(1), \mathbf{x}_i(2), \ldots, \mathbf{x}_i(t)]^{T} = [x_{i,k}(j)]_{t \times C} =
\begin{bmatrix}
x_{i,1}(1) & x_{i,2}(1) & \cdots & x_{i,C}(1) \\
x_{i,1}(2) & x_{i,2}(2) & \cdots & x_{i,C}(2) \\
\vdots & \vdots & \ddots & \vdots \\
x_{i,1}(t) & x_{i,2}(t) & \cdots & x_{i,C}(t)
\end{bmatrix}$$

where $S(n,k)$ (also known as the Stirling number of the second kind) is
the number of partitions of n nodes into k nonempty sets and
$S(n,k) = S(n-1,k-1) + k\,S(n-1,k)$ (Jackson and Thoro, 1989).
Each element of the matrix, $x_{i,k}(j)$, represents the perceived value
(estimate) of the critical information element $a_k$ at time j for
cluster i.

We  wish  to  represent  the  local  decisionmaking  process  within 
cluster  i,  using  ideas  based  on  the  Rapid  Planning  Process.  We  thus 
represent  the  local  conceptual  space  of  the  decisionmakers  within 
cluster  i  by  a  space  spanned  by  a  small  number,  C,  of  information 
elements  that  are  the  key  concerns  of  the  decisionmakers  within  the 
cluster,  as  depicted  for  C  =  2  in  Figure  2.1. 

Framing 

For the entire network, we assume there is a maximum of N of these
critical information elements, $a_k$, and therefore
$A = \{a_1, \ldots, a_N\}$ is the global set of critical information
elements that we shall refer to as the superset of critical information
elements. Each of the critical information elements, $a_k$, is perceived
to have the value $x_{i,k}(j)$ at time step j within cluster i.

Suppose N = 4, so that the complete information set is
$A = \{a_1, a_2, a_3, a_4\}$. For each cluster, the local conceptual
picture will be ‘framed’ by selecting a subset of A. For example, the
local conceptual space of a cluster might be framed by the set of
elements $A_1 = \{a_1, a_2\}$. The space of a second cluster might be
framed by the set $A_2 = \{a_2, a_3, a_4\}$. Then, given that the two
clusters collaborate, the local collaboration between them results in a
shared conceptual space that is framed by the elements
$A_{\cup} = A_1 \cup A_2 = \{a_1, a_2, a_3, a_4\}$.5
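
Framing and collaboration reduce to simple set operations. The sketch below is illustrative only: the element labels mirror the N = 4 example, while the names `SUPERSET` and `shared_frame` are hypothetical.

```python
# Global superset of critical information elements (N = 4 in the example).
SUPERSET = frozenset({"a1", "a2", "a3", "a4"})


def shared_frame(*frames):
    """Union of the frames of collaborating clusters (their shared space)."""
    shared = frozenset().union(*frames)
    if not shared <= SUPERSET:
        raise ValueError("every frame must be a subset of the superset")
    return shared


frame_1 = frozenset({"a1", "a2"})        # frame of the first cluster
frame_2 = frozenset({"a2", "a3", "a4"})  # frame of the second cluster
frame_union = shared_frame(frame_1, frame_2)
covers_superset = frame_union == SUPERSET
```

Here the union happens to recover the whole superset; with a second frame of, say, {a2, a3}, the shared space would omit a4.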

Shared  Awareness  and  Clustering 

A  cluster  of  decision  nodes  as  defined  earlier  corresponds  to  a  form  of 
shared  awareness  if  the  information  shared  among  the  cluster  nodes  is 
available  and  internalised  at  each  decision  node  in  the  cluster.  By 
‘shared  awareness’,  we  mean  the  ability  of  the  decision  nodes  in  the 


5 In this simple example, we have that $A_{\cup} = A$; however, this is not always true.




cluster to share realisations about the critical information elements.
We further state that the nodes of a cluster possess full shared
awareness if, in addition to sharing the same set of critical
information elements, they further agree on the values each of these
should take at a given time.6

These perceived values, $x_{i,k}(j)$, of the critical information
elements form the input data to cluster i at time step j as described in
Appendix A (The Rapid Planning Process) at stage 1 (observation
analysis and parameter estimation). Within cluster i, we assume there
is a shared set of fixed patterns or stored situations in the shared
local conceptual space; these are the areas of the space about which
decisionmakers within the cluster are particularly concerned. In the
basic approach, these are represented by multivariate normal probability
distributions in the conceptual space, as described in Appendix A.
However, when a multivariate normal representation is not appropriate,
more general methods must be applied, as will be discussed later. In
either case, these fixed patterns are assumed to be directly linked to
one of a small set of key courses of action (or missions) from which the
local decisionmakers within the cluster can choose.


A  Simple  Logistics  Example 

Sustainment  of  deployed  forces  is  one  of  the  more  difficult  logistics 
tasks.  In  this  simple  model,  we  do  not  claim  to  have  examined  all  the 
problems  associated  with  supplying  the  force.  In  fact,  we  explore  a 
single  decision:  allocating  supplies  to  competing  friendly  units.  This 
would  be  part  of  a  sustainment  plan,  and  our  task  is  to  examine  how 
various  decisionmakers  contribute  to  the  plan  across  a  simple  network 
of  information  sharing. 

Figure  2.3  depicts  the  structure  of  a  push  (a)  and  pull  (b)  system 
for  logistics  resupply  from  a  holding  point  to  two  local  commanders. 


6  This  relates  to  the  models  of  situational  awareness  such  as  those  discussed  in  Endsley 
(1995)  and  Feltham,  Sheppard,  and  Cooper  Chapman  (2003). 
The allocation decision is made by the master in Figure 2.3a and the
arbiter  in  Figure  2.3b. 

In Figure 2.3a, the master node decides which local commander has
priority, and therefore there is no benefit to be gained from the two
demand nodes collaborating. As a result, the demand nodes (local
commanders) are considered sources of information about their own stock
levels so that the critical information set required by the master is
$A = \{a_1, a_2\}$, where the numerical subscripts refer to the supply
levels at the two demand points. Consequently, the network consists of
the decision node ‘cluster’ and the two demand nodes. Information about
global stock levels is only available at the master node.

In Figure 2.3b, the arbiter responds to the demands from the local
commanders. All three nodes in this case are decision nodes and require
the same information to make their decisions: the local supply levels
$A = \{a_1, a_2\}$, as in the master case. The local commanders place
demands on the arbiter based on their anticipated requirements, and the
arbiter allocates stocks based on knowledge of global supplies and
anticipated future needs at both demand nodes. The knowledge of global
stock levels is based on shared information from the demanding nodes and
the stock levels available to the arbiter. In this case, we can consider
the benefit of the local commanders collaborating in order to ensure
that their demands are placed by taking account of global knowledge
about stock levels. The network therefore consists of a single cluster
that comprises the three decision nodes, as depicted in the diagram.7

Figure 2.3
Networked Sustainment Decisions (A. Push sustainment; B. Pull sustainment)

In the push case, no other partitions are possible because there is
only one decision node. In the pull case, however, it is possible to
consider the arbiter and one of the headquarters nodes in a single
collaborating cluster and the other as a single decision node cluster.
Operationally, we would expect the arbiter, in this case, to give
priority to the connected headquarters, with the residual supply going
to the single-node cluster. It is not possible, however, to partition
the two headquarters as a single cluster with the arbiter as a single
decision node cluster because it would violate the operational concept.
For this example, we consider the single cluster in each case.

Each  cluster  supports  local  decisionmaking  within  the  cluster. 
We  can  enrich  the  representation  by  adding  an  information  node  that 
supplies  more  detail  on  the  operational  plan,  the  synchronisation 
matrix  of  the  forces,  and  the  resultant  likely  pattern  of  demand  for 
stocks.  We  focus  here  on  the  demand  for  fuel  supplies  to  make  the 
example more concrete. In Chapter Four, we will discuss the implications
of these two modes of supply in terms of information sharing
through  collaboration  and  network  knowledge.  Later,  we  will  address 
the  costs  of  achieving  this  level  of  knowledge  as  well. 


7  All  three  agree  that  the  local  and  global  stock  levels  are  the  critical  information  elements, 
they  all  share  information  about  the  value  of  these  critical  information  elements,  and  they  all 
agree  on  these  values. 
Representing Uncertainty


The  decisions  within  each  of  the  clusters  must  be  taken,  in  most 
cases,  without  full  knowledge  of  the  values  of  the  critical  information 
elements  needed  to  support  the  decisions.  The  degree  of  uncertainty 
depends  on  the  information  collection  assets  devoted  to  the  cluster’s 
critical  elements  of  information  and  the  extent  to  which  collaboration 
among  the  cluster  decision  nodes  is  facilitated.  Information  entropy  is 
a  reasonable  estimate  of  the  uncertainty,  and  consequently  we  use  a 
probabilistic  entropy  model  to  represent  the  uncertainty  associated 
with  the  critical  information  elements  needed  within  the  cluster.  The 
reports  on  the  values  of  the  critical  information  elements  are  treated 
as estimates of the means of the distributions describing their
uncertainty, and therefore the quality of the estimates is assessed using
concepts  from  estimation  theory. 


Decisions 

The decisions taken within each of the clusters depend on the scenario
represented. In Figure 2.3, we depicted a simple logistics example in
which the decision is the quantity of supply to allocate to each
demanding unit. In general, we focus on operational and tactical
decisions made at the division/brigade, ship group, or equivalent level
and below. The decision taken within the cluster depends on the current
estimated values of the critical information elements.
The dependency among the critical information elements is modelled
using the Rapid Planning Process, provided that uncertainty can be
expressed in the form of normal distributions. The Rapid Planning
Process is a set of algorithms that together represent local command
decisionmaking at each of the decision nodes (Moffat, 2002). The process
requires that the commander’s local ‘conceptual space’ be spanned by a
small number of critical information elements. These elements are a
subset of the total set, $\{a_1, \ldots, a_N\}$, of information elements
considered across the network. In the basic formulation, a dynamic
linear model (DLM; see Appendix A) is then used to represent the
decisionmaker’s estimates of the values of these factors over time.
Ideally, through this process, additional information from collection
assets or from collaborating elements in the network serves to reduce
uncertainty and therefore increase understanding.


A  Multivariate  Normal  Model 

We begin with a simple case in which we assume that the uncertainty in
these critical information elements is represented by a multivariate
normal distribution, and we are considering all the information elements
$A = \{a_1, \ldots, a_C\}$ shared across a cluster.1 Their values are
represented by the random vector $X = [x_1, x_2, \ldots, x_C]^T$. In this
case, the DLM can be used to make a local assessment of the overall
uncertainty of the vector of critical information elements within the
cluster. The uncertainty in the vector is represented as the
multivariate normal distribution

$$f(X) = \frac{1}{(2\pi)^{C/2}\,|\Sigma|^{1/2}}\; e^{-\frac{1}{2}[X-\mu]^T \Sigma^{-1} [X-\mu]}$$

1  We  will  deal  later  with  the  more  general  case  in  which  this  assumption  need  not  hold. 
where $\mu = [\mu_1, \mu_2, \ldots, \mu_C]^T$ is the mean and

$$\Sigma =
\begin{bmatrix}
\sigma_1^2 & \Sigma_{1,2} & \cdots & \Sigma_{1,C} \\
\Sigma_{2,1} & \sigma_2^2 & \cdots & \Sigma_{2,C} \\
\vdots & \vdots & \ddots & \vdots \\
\Sigma_{C,1} & \Sigma_{C,2} & \cdots & \sigma_C^2
\end{bmatrix}$$

is the covariance matrix. The off-diagonal elements are the covariance
values between the random variables $x_i$ and $x_j$, calculated as
$\Sigma_{i,j} = E[(x_i - \mu_i)(x_j - \mu_j)]$. The value

$$\rho_{i,j} = \frac{\Sigma_{i,j}}{\sigma_i \sigma_j}$$

is the correlation between the random variables $x_i$ and $x_j$. When
$i = j$, then $\Sigma_{i,j}$ is just the variance $\sigma_i^2$ depicted
along the diagonal in the covariance matrix. The entropy of the
distribution (as we will discuss in more detail later) can be easily
calculated from the covariance matrix and is then used as the basis for
a quantifiable metric of the knowledge available to the cluster. With
improved knowledge, we can refine the estimates of the critical
information elements to more closely reflect ground truth.


Knowledge  from  Entropy 

Decisions taken within a cluster depend on the degree to which the
local decisionmakers know the true values for each of the critical
information elements. $f(X)$ represents the level of uncertainty
associated with the values of the critical information elements. It
therefore forms the basis for measuring the level of knowledge. To
quantify the level of knowledge, we apply the concept of information
entropy, borrowed from information theory.

Information entropy, sometimes referred to as Shannon entropy, measures
the amount of information in a probability distribution (Shannon, 1948).
Shannon entropy for a probability density function, $f(X)$, is defined
to be the expected value of the negative logarithm of $f(X)$, or

$$H(X) = E[-\log f(X)] = -\int_{x_1}\int_{x_2}\cdots\int_{x_C} f(X)\,\log f(X)\; dx_C \cdots dx_2\, dx_1.$$

If, as in this case, $f(X)$ is continuous, $H(X)$ is referred to as
differential entropy.2

For the multivariate normal distribution, the differential entropy is
calculated to be

$$H(X) = \log\sqrt{(2\pi)^C\,|\Sigma|} + \frac{C}{2} = \frac{1}{2}\log\left[(2\pi e)^C\,|\Sigma|\right] \qquad (3.1)$$

where $|\Sigma|$ is the modulus of the determinant of the covariance
matrix $\Sigma$ and $C \le N$ is the number of information elements
critical to the cluster.

In this work, we are interested in relative entropy, and therefore
noting that $H(X)$ varies solely with the covariance (since C is held
constant), we simplify equation (3.1) to $H_r(X) = \log|\Sigma|$.
$H_r(X)$ is then a local measure of the (relative) information entropy.
From now on, we will drop the subscript r.
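
As a numerical check on the closed form for the multivariate normal case, the following sketch evaluates both the full differential entropy, read here as one half of the logarithm of (2πe)^C |Σ|, and the simplified relative measure log |Σ| for a small covariance matrix. It uses only the standard library; the function names and the example matrices are hypothetical, and natural logarithms are assumed.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(x) for x in row] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def mvn_differential_entropy(cov):
    """Differential entropy (in nats) of a multivariate normal distribution."""
    c = len(cov)
    return 0.5 * math.log((2.0 * math.pi * math.e) ** c * abs(determinant(cov)))


def relative_entropy(cov):
    """Relative measure log|Sigma|, valid when C is held constant."""
    return math.log(abs(determinant(cov)))


loose = [[4.0, 0.0], [0.0, 9.0]]   # large variances, no correlation
tight = [[1.0, 0.8], [0.8, 1.0]]   # small variances, strong correlation
entropy_drop = relative_entropy(loose) - relative_entropy(tight)
```

The difference between two covariance matrices is the same under either measure, which is why the constant term can be dropped when only relative entropy matters.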

Knowledge 

Knowledge derived from entropy is a quantity, $0 \le K(X) \le 1$, that
reflects the degree to which the local decisionmakers within a cluster
know the true values of the information elements,
$\{a_1, \ldots, a_C\}$, and their interaction. For $K(X) \to 1$,
knowledge is considered to be good, and for $K(X) \to 0$, it is
considered to be poor.

For the multivariate normal distribution, $K(X)$ is calculated as
follows. We first assume the existence of a maximum joint entropy,

$$H_{\max}(X) = \log|\Sigma|_{\max}.$$


2 See Appendix B for a more detailed discussion of information entropy.
Physically, this can be interpreted to be the maximum uncertainty in
the probability distribution, $f(X)$. If, for example, the information
elements consist of the x- and y-coordinates associated with the
location of an enemy unit, the maximum entropy might be associated with
the search area. Search areas are derivative of ‘named areas of
interest’ (NAIs) or ‘target areas of interest’ (TAIs). If, through the
IPB process, we are able to describe a circular area in an NAI or TAI
within which we are virtually certain the enemy unit is located, we can
then relate this information through a circular error probable (CEP) to
the variance of the location in the x- and y-directions.3

If the maximum entropy is taken to be $\log|\Sigma|_{\max}$, then the
residual entropy at any given time is $\log|\Sigma|_{\max} - H(X)$. A
formulation for $K(X)$ that ensures a value confined to the interval
[0,1] is therefore

$$K(X) = 1 - e^{-[\log|\Sigma|_{\max} - H(X)]} = 1 - \frac{|\Sigma|}{|\Sigma|_{\max}}.$$

When the modulus of the determinant of the covariance is close to its
maximum, knowledge is at a minimum, whereas for small values of the
covariance determinant, knowledge is greatest.4
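
The behaviour described here can be illustrated numerically. The sketch below assumes the knowledge function takes the form K = 1 − |Σ| / |Σ|max (our reading of the formulation above) and uses the σ = CEP / 1.1774 relation quoted in the text to set the maximum for a two-coordinate location problem. The function names and the clipping of K at zero are illustrative choices, not part of the original model.

```python
CEP_TO_SIGMA = 1.1774  # sigma = CEP / 1.1774 for a circular normal error


def bivariate_determinant(sigma_x, sigma_y, rho=0.0):
    """|Sigma| for two elements with given standard deviations and correlation."""
    return (sigma_x ** 2) * (sigma_y ** 2) * (1.0 - rho ** 2)


def max_determinant_from_cep(cep):
    """Assumed |Sigma|_max derived from the radius of the maximum search area."""
    sigma = cep / CEP_TO_SIGMA
    return bivariate_determinant(sigma, sigma)


def knowledge(det_cov, det_max):
    """K = 1 - |Sigma| / |Sigma|_max, clipped so that K stays in [0, 1]."""
    ratio = abs(det_cov) / det_max
    return 1.0 - min(ratio, 1.0)


# A search-area radius of 1.1774 units gives sigma = 1 and |Sigma|_max = 1.
det_max = max_determinant_from_cep(cep=1.1774)
k_wide = knowledge(bivariate_determinant(0.9, 0.9), det_max)   # little is known
k_tight = knowledge(bivariate_determinant(0.3, 0.3), det_max)  # location well known
```

Tightening the location estimate from a standard deviation of 0.9 to 0.3 moves the knowledge value from near the bottom of the scale to near the top.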

The  Effects  of  Knowledge 

As a basis for our consideration of the effects of knowledge, we use
the domain structure depicted in Figure 1.1. We are principally
concerned with the information and cognitive domains. Information
derived from sensors or other information gathering resides in the
information domain. It is then transformed into awareness and knowledge
in the cognitive domain and forms the basis of decisionmaking. Our
metrics quantify this process through the use of information entropy and
the derivative knowledge measures. Information sharing among nodes
ideally tends to lower information entropy (and hence increases
knowledge) because of the reduction in variance and the buildup of
correlations among the critical information elements.


3 For unit location, if we assume that $\sigma_x = \sigma_y = \sigma$ and
that $\sigma_{xy} = \sigma_{yx} = 0$, then the CEP is related to the
common standard deviation as follows: $\sigma = \mathrm{CEP}/1.1774$.
CEP in this formula is the radius of the maximum search area. See
Burington and May (1958).

4 See Perry et al. (2002).

One of the key aspects of increased knowledge (and, correspondingly,
reduced entropy) is thus an increased understanding of the correlations
between variables. This means information can be gained about one
critical information element (e.g., missile type) from another (e.g.,
missile speed). Such cross coupling is a key aspect for consideration as
we extend our analysis from normal to more arbitrary probability
distributions.
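
The effect of correlation on entropy can be made explicit in the simplest case. As an illustrative special case (assumed here, not taken from the original), consider two critical information elements that are bivariate normal with standard deviations $\sigma_1$ and $\sigma_2$ and correlation $\rho$:

```latex
\begin{aligned}
|\Sigma| &=
\begin{vmatrix}
\sigma_1^2 & \rho\,\sigma_1\sigma_2 \\
\rho\,\sigma_1\sigma_2 & \sigma_2^2
\end{vmatrix}
= \sigma_1^2\,\sigma_2^2\,(1-\rho^2), \\[4pt]
\log|\Sigma| &= \log\sigma_1^2 + \log\sigma_2^2 + \log(1-\rho^2).
\end{aligned}
```

Because $\log(1-\rho^2) \le 0$, any nonzero correlation lowers the relative entropy below the sum of the two marginal terms, and reducing either variance lowers it further. Both mechanisms named above therefore raise the knowledge measure.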


More  General  Models 

The multivariate normal assumption is likely to be restrictive for some
applications. The example above in which it was used to represent the
location of a target is perhaps the best-known military application. A
more general model for a cluster recognises that the uncertainty
associated with each of the critical information elements is likely to
be represented by unique probability distributions and that their joint
distribution is either unknown or can be discerned only through a
laborious combinatorial process.

For example, suppose we wish to model a US carrier battle group
executing a cruise missile defence mission with its attached Aegis
cruisers.5 Our cluster in this case might consist of the decisionmakers
on board each of the Aegis cruisers taking part in the mission, assuming
that all commanders in the cluster are able to


5  This  is  a  very  real  problem  examined  extensively  by  the  Royal  Navy  and  the  US  Navy.  In 
the  United  States,  the  Cooperative  Engagement  Capability  (CEC)  is  being  developed  in 
response  to  the  challenges  of  littoral  warfare,  the  shrinking  size  of  US  and  Allied  navies,  and 
improvements to adversary capabilities. CEC is an approach to air defence that allows combat
systems to rapidly share unfiltered sensor measurement data and track data to enable a
carrier  battle  group  to  operate  collectively.  See  ‘The  Cooperative  Engagement  Capability’ 
(1995). 
share information with each other. The decision to be taken is when and
where to engage an incoming enemy cruise missile.6 We further assume
that each weapon system (standard missiles on board the Aegis cruisers)
requires the same information (the location of the target as latitude
and longitude, its altitude and speed, its direction, and its type) so
as to have a critical information set that is uniform among all decision
nodes in the cluster:7

$$A = \{\text{location}, \text{altitude}, \text{speed}, \text{direction}, \text{missile type}\} = \{a_1, a_2, a_3, a_4, a_5\}.$$

These are the information elements shared among the cluster, leading to
full shared awareness within the cluster. The location of the missile
has two components, latitude and longitude, and therefore we have
$a_1 = [a_{1,x}, a_{1,y}]$. The uncertainty of these components is taken
to be bivariate normal as developed earlier. As tempting as it is to
include altitude in location and model uncertainty in three dimensions,
we recognise that altitude is bounded from below and therefore its
uncertainty is better described using a density such as the lognormal or
the gamma.8 This situation is also true of speed. Direction, however, is
circular and therefore bounded between 0 and $2\pi$. If normalised on
[0,1], the uncertainty here can be represented by a beta density.
Missile type is nominal, and therefore its distribution is empirical.

Although more realistic, this representation is clearly more
problematic. Added to the complexity is the fact that not all the
information elements are independent, and therefore their joint
distribution is not likely to be multiplicative. For example, the speed
of a missile is, in some part, a function of its type, as is its
altitude. Its

6  We  omit  a  discussion  of  shooting  policy,  centralised  versus  decentralised  command  and 
control,  and  the  participation  of  ground  defence  units.  These  are  all  interesting  aspects  of  the 
problem  and  their  examination  in  a  network-centric  environment  will  lead  to  the  assessment 
of  several  alternative  network  structures  and  command  and  control  arrangements — what  our 
models  are  ultimately  designed  to  accomplish. 

7  In  this  case,  direction  refers  to  the  bearing  of  the  incoming  missile  and  not  its  inclination. 

8  Actually,  it  is  likely  bounded  from  above  as  well,  and  therefore  one  might  argue  for  a  beta 
distribution.  In  either  case,  a  normal  distribution  is  not  appropriate. 
location and direction at any point in time, however (ignoring its
trajectory history), need not be.

Two problems arise from this more general representation: (1)
describing the joint probability distribution, $f(A)$, needed to account
for the dependencies within the critical information elements, and (2)
combining the knowledge functions for each of the marginal distributions
to create an overall measure of local knowledge. We discuss two methods
for dealing with these issues: multi-attribute assessment and mutual
information.


Multi-Attribute  Assessment 

The  simplest  (but  perhaps  not  the  most  accurate)  way  to  deal  with 
the  problem  of  combining  the  knowledge  functions  associated  with 
multiple  distributions  is  to  create  a  weighted  sum  that  represents  the 
current level of knowledge of the combined critical information
elements. Weights generally imply some notion of relative importance.
Although  indeed  desirable,  weights  are  not  enough  in  all  cases.  What 
is  needed  is  some  way  to  represent  the  inherent  dependencies  among 
the  information  elements.  Regardless  of  how  well  we  are  able  to 
achieve  this  goal,  a  weighted  sum  is  inherently  flawed  because  of  the 
fact  that  knowledge  need  not  be  additive.  Nevertheless,  as  a  means  of 
comparison,  the  methodology  has  value. 

The objective of multi-attribute assessment is to derive a single
knowledge value that describes the joint level of knowledge about the
critical information elements within a cluster and, ultimately,
throughout the network. In the multivariate normal case, described
earlier, this value is just the knowledge function derived from the
distribution’s information entropy. By deriving this single value, we
model how a decisionmaker within the cluster assesses the current
estimates of the critical information, based on the information he has
received, and the level of knowledge derived from these estimates. This,
in turn, can be used to select a course of action (take a decision).
The two methods discussed here derive from multiple attribute
decisionmaking (MADM) theory; in particular, the MADM techniques in
which the decisionmaker is supplied with the value of different
sub-attributes that contribute to an overall value. Generally, MADM
methods are used when a decision must be made between two or more
alternatives based on multiple attributes that have incommensurable
units (for example, speed and direction). The choice of one technique
over another depends on the nature of the attributes being combined and
their relation to one another. Here we discuss two methods: simple
additive weights (SAW) and weighted product.9 In both methods, we use
the terms ‘system’ and ‘system instantiation’ to refer to the combat
situation at the time estimates of the critical information elements are
to be assessed.

Simple  Additive  Weights  Method 

The SAW method (Fishburn, 1967) is perhaps the simplest method
of aggregation and is a relatively old technique. It is cited in Article I,
Section 2, of the US Constitution as a method to determine the
degree of a state’s representation in the Union.10 It is generally used
when the attributes are independent of each other. For a case in
which there are C attributes shared across the cluster, we get

$$V(A) = \sum_{i=1}^{C} \omega_i V(a_i),$$

where $V(A)$ is the value of the system instantiation with critical
information elements $a_i$. The term $\omega_i$ ($\sum_{i=1}^{C} \omega_i = 1$) is the weight
(importance) of information element $a_i$, and $V(a_i)$ is its value
(knowledge function in this case) for the instantiation being considered.
Unfortunately, the likelihood that all information elements
shared across the cluster are independent is very small, so this technique
is not widely applicable except where the weights can be made
to account for the dependencies in some way.
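As a minimal sketch of the SAW aggregation, the following Python fragment computes the weighted sum of the element values. The function name `saw_value` and the four weights and knowledge values are hypothetical, chosen only for illustration; the sketch assumes the weights are normalised to sum to 1.

```python
def saw_value(weights, values):
    """Simple additive weights: V(A) = sum_i w_i * V(a_i).

    Assumes one weight per information element and weights summing to 1.
    """
    if len(weights) != len(values):
        raise ValueError("one weight per information element is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1")
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical weights and knowledge values for location, speed,
# direction and missile type (illustrative numbers only).
weights = [0.4, 0.2, 0.2, 0.2]
values = [0.9, 0.5, 0.7, 0.6]

overall = saw_value(weights, values)
print(round(overall, 3))
```

Because the sum treats every element separately, a low value for one element can be offset by high values elsewhere, which is reasonable only when the elements are independent.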


9  For  a  complete  discussion  of  several  more  methods,  see  Perry  et  al.  (2002). 

10  See  Yoon  and  Hwang  (1995,  p.  32). 
Weighted Product Method

The weighted product method (Bridgman, 1922) is similar to the
SAW technique, except in this case the values of the different attributes
are multiplied. The general form of this method is

$$V(A) = \prod_{i=1}^{C} [V(a_i)]^{\omega_i},$$

where $V(A)$, $a_i$, and $\omega_i$ are as above.

Although $V(A)$ might be used directly as a measure of combined
system value, it is often the case that its value in relation to a positive
ideal is used instead, so we obtain

$$\frac{V(A)}{V(A^*)} = \frac{\prod_{i=1}^{C} [V(a_i)]^{\omega_i}}{\prod_{i=1}^{C} [V(a_i^*)]^{\omega_i}},$$

where $A^*$ is the positive ideal that may or may not be achievable.11
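The weighted product and its ratio to a positive ideal can be sketched in Python as follows. The weights, values and the choice of an ideal with every element at its maximum value of 1 are assumptions made for illustration; `weighted_product` is a hypothetical helper name.

```python
import math


def weighted_product(weights, values):
    """Weighted product: V(A) = prod_i V(a_i) ** w_i."""
    return math.prod(v ** w for w, v in zip(weights, values))


# Hypothetical weights and knowledge values (illustrative only).
weights = [0.4, 0.2, 0.2, 0.2]
values = [0.9, 0.5, 0.7, 0.6]

# Assumed positive ideal: every element at its maximum value of 1.
ideal = [1.0, 1.0, 1.0, 1.0]

ratio = weighted_product(weights, values) / weighted_product(weights, ideal)
print(round(ratio, 3))
```

Unlike the additive form, a value of zero for any one element drives the whole product to zero, so no element can be fully compensated by the others.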

A similar approach is the Keeney-Raiffa multi-attribute utility
method (de Neufville, 1990). In this method, the aggregation evaluation
takes the form

$$\Omega V(A) + 1 = \prod_{i=1}^{C} [\Omega \omega_i V(a_i) + 1],$$

where $\Omega$ is a normalising factor used to ensure consistency between
the definition of $V(A)$ and the $V(a_i)$’s. The value of $\Omega$ is given by

$$\Omega + 1 = \prod_{i=1}^{C} [\Omega \omega_i + 1].$$

This technique is advantageous because it allows for the consideration
of possible interactions between the attributes, something
clearly desirable if we wish to account for dependencies. For example,


11 The positive ideal case, also sometimes referred to as the most favourable case, is the
instantiation with the highest overall value. The positive ideal case is selected from the
existing instantiations, constructed as a combination of the existing instantiations, or built
using the maximum possible value for each attribute.
if $C = 2$, and $\{a_1, a_2\}$ is the set of information elements shared across
the cluster, we get

$$V(A) = \omega_1 V(a_1) + \omega_2 V(a_2) + \Omega \omega_1 \omega_2 V(a_1) V(a_2),$$

with $\Omega = (1 - \omega_1 - \omega_2)/(\omega_1 \omega_2)$.
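For the two-element case, the Keeney-Raiffa form can be sketched in a few lines of Python. The weights and values below are hypothetical; the function simply evaluates the normalising factor (1 - w1 - w2)/(w1 w2) and the additive terms plus the interaction term.

```python
def keeney_raiffa_two(w1, w2, v1, v2):
    """Two-element Keeney-Raiffa aggregation.

    Returns (V, omega), where omega = (1 - w1 - w2) / (w1 * w2) and
    V = w1*v1 + w2*v2 + omega*w1*w2*v1*v2.
    """
    omega = (1.0 - w1 - w2) / (w1 * w2)
    value = w1 * v1 + w2 * v2 + omega * w1 * w2 * v1 * v2
    return value, omega


# Hypothetical weights (deliberately not summing to 1) and values.
value, omega = keeney_raiffa_two(0.5, 0.3, 0.8, 0.6)
print(round(value, 3), round(omega, 3))
```

When the weights sum to less than 1, the normalising factor is positive and the interaction term rewards combinations in which both elements are well known; when both values equal 1, the aggregate is also 1.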

Precedence  Weighting 

Precedence weighting provides a method to get at the dependencies
among the information elements. The weights are computed based
on these dependencies. The relative importance of the information
elements is assessed singly or in combination. For example, suppose
we have decided that the information elements, shared across our
CEC cluster, required to accurately engage an attacking cruise missile
in our example are location, speed, direction, and missile type. Recall
that the decision to be made is when and where to launch a standard
missile to intercept the incoming enemy cruise missile. For each of
the information elements and combinations of information elements,
we ask: Can the decision be taken with just this (these) information
element(s)? For example, can the decision to intercept be taken knowing
only the location of the enemy cruise missile? With location and
speed? And so on. For a given set of information elements of size r,
$2^r$ such questions must be asked. In this example, this amounts to
16 questions.12 For large information sets, this method could become
rather cumbersome, hence the omission of ‘altitude’.

Other questions arise: If a decision can be made knowing the
value of only one of the four information elements, what added
value does knowing the other three provide? Are the information
elements not used in the decision therefore still ‘critical’? First, we
assume that if an information element is designated as ‘critical’, it is
needed to fully inform the decision. We recognise, however, that


12  This  includes  the  empty  set,  i.e.,  no  information  elements  are  available,  and  the  entire  set, 
i.e.,  all  information  elements  are  available.  We  would  expect  the  answer  to  the  former  to  be 
‘no’  and  the  latter  to  be  ‘yes’.  This  is  sometimes  referred  to  as  the  information  element  power 
set. 
decisions are taken with less-than-complete information but that
there  is  some  subset  below  which  a  decision  is  impossible  or 
extremely  risky — regardless  of  the  urgency  of  the  situation.  In  this 
example,  having  estimates  for  all  information  elements  is  better  than 
two  or  one.  One  way  to  acknowledge  this  is  to  assign  weights  to 
various  combinations  of  the  information  elements.  However,  doing 
this  leads  us  back  to  subjective  linear  weighting.  Consequently,  we 
rely  solely  on  counts  for  this  method. 

The  answers  to  the  questions  determine  the  weights  assigned  to 
each  element.  Table  3.1  summarises  the  answers  to  the  16  questions. 

The  next  step  is  to  count  the  number  of  combinations  that 
result  in  a  ‘yes’  response  in  the  last  column  of  the  table  for  each  of  the 
information  elements.  For  example,  of  the  16  combinations  here 
(including  the  empty  set),  location  occurs  in  eight  with  a  ‘yes’ 
response.  In  each  of  these,  it  was  determined  that  a  decision  to  engage 
the  cruise  missile  could  have  been  made  with  just  the  information 
elements  in  the  combination.  For  the  remaining  three,  the  count  is 
five  each. 

Because location alone was considered sufficient for a launch
decision, any of the other combinations that included location were
also considered sufficient. The other three information elements
each appear in exactly five ‘yes’ combinations: the four combinations
that include both that element and location, plus the single
combination of speed, direction, and type without location. No one
of these three elements alone, and no pair of them, was considered
sufficient to order a launch, but all three together were. If it were
the case that each of the four information elements alone were
sufficient to order a launch, then each would earn a score of 8, which
is equivalent to equally weighted, independent information elements.

If  all  information  elements  were  necessary  and  no  lesser  combi¬ 
nation  sufficient  to  launch,  we  get  the  same  result.  In  this  case,  each 
information  element  would  receive  a  score  of  1 . 

Calculating the relative weights from these results consists of
using the sum of the scores to normalise the weights so that

$$\omega_i = c_i \Big/ \sum_{j=1}^{C} c_j,$$
Table 3.1
Precedent Weight Assessment

Location (a_1)   Speed (a_2)   Direction (a_3)   Type (a_4)   Yes/No
      X               -               -               -         Yes
      -               X               -               -         No
      -               -               X               -         No
      -               -               -               X         No
      X               X               -               -         Yes
      X               -               X               -         Yes
      X               -               -               X         Yes
      -               X               X               -         No
      -               X               -               X         No
      -               -               X               X         No
      X               X               X               -         Yes
      X               -               X               X         Yes
      -               X               X               X         Yes
      X               X               -               X         Yes
      X               X               X               X         Yes

where $c_i$ is the score for information element $a_i$. In this example, we
would get the following weights: $\omega_1 = 0.348$ and $\omega_2 = \omega_3 = \omega_4 = 0.217$.
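The counting procedure can be sketched in Python as follows. The sufficiency rule in `is_sufficient` is an assumed encoding of the answers in Table 3.1 (location alone suffices; otherwise speed, direction and type are all needed), and the function names are hypothetical.

```python
from itertools import combinations

ELEMENTS = ("location", "speed", "direction", "type")


def is_sufficient(subset):
    """Assumed encoding of the yes/no answers in Table 3.1."""
    chosen = set(subset)
    return "location" in chosen or {"speed", "direction", "type"} <= chosen


def precedence_weights(elements, sufficient):
    """Score each element by the number of 'yes' combinations containing it,
    then normalise the scores so that the weights sum to 1."""
    scores = {e: 0 for e in elements}
    # Enumerate the full power set, including the empty set.
    for size in range(len(elements) + 1):
        for subset in combinations(elements, size):
            if sufficient(subset):
                for e in subset:
                    scores[e] += 1
    total = sum(scores.values())
    return scores, {e: s / total for e, s in scores.items()}


scores, weights = precedence_weights(ELEMENTS, is_sufficient)
for element in ELEMENTS:
    print(element, scores[element], round(weights[element], 3))
```

The loop visits all 2^r subsets, which is why the approach is practical only for small sets of information elements.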

This method is practical only for small sets of information elements,
since the dimension of the problem increases exponentially
with the number of information elements. However, for most operational
decisions, the number of critical information elements is small,
and indeed, we assume this to be the case in this analysis. Even for
those cases in which the number is large, it is likely that certain
combinations are not worth examining because it is obvious that the
combination would not be sufficient.


Mutual  Information 

Next, we discuss a more direct method to derive the multi-attribute
knowledge function for a set of information elements shared across a
cluster. Mutual information is derived from information entropy (see
Appendix B) and deals directly with the issue of independence (or
rather lack thereof) among the information elements. Although a
joint probability density function is still required, mutual information
allows us to account for the dependencies even when the joint
distribution is empirical.

What we desire is a mathematical construct that will allow us to
modify our knowledge about a random variable (information element,
X) based on our knowledge of a second random variable
(information element, Y) when X and Y are not independent random
variables. Because one random variable informs another, we refer to
this construct as mutual information.

Relative  Entropy 

Relative entropy measures the ‘distance’ between two probability
mass functions, denoted $D[p(x) \| q(x)]$. It is essentially the error
incurred by assuming the true distribution for X is $q(x)$, when it is
really $p(x)$. Relative entropy as defined by Cover and Thomas
(1991)13 is

$$D[p(x) \| q(x)] = \sum_{x} p(x) \log \frac{p(x)}{q(x)}.$$

In this definition, we have

$$0 \log \frac{0}{q(x)} = 0 \quad \text{and} \quad p(x) \log \frac{p(x)}{0} = \infty.$$

If $p(x) = q(x)$, then $D[p(x) \| q(x)] = 0$. However, relative entropy is not
a true distance metric because it is not symmetric. That is,

$$D[p(x) \| q(x)] = D[q(x) \| p(x)]$$

is not always true.14 Kullback (1978, p. 6) refers to the quantity


13  See  also  Kullback  (1978). 

14  A  true  metric  satisfies  the  following  properties: 

A  metric  space  is  a  pair  (X,d) ,  where X  is  a  set  and  d  is  a  metric  on  X  (or  a  distance  function  on  X), 
such  that  for  all  x,  y,z  e  X  we  have: 
$$D[p(x) \| q(x)] + D[q(x) \| p(x)]$$

as a measure of divergence between $p(x)$ and $q(x)$ and, therefore, a
measure of the difficulty of discriminating between them.
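A minimal Python sketch of relative entropy and of the symmetric divergence (the sum of the two relative entropies) is given below, using natural logarithms. The two three-point distributions are hypothetical and serve only to show that swapping the arguments changes the result.

```python
import math


def relative_entropy(p, q):
    """D[p || q] = sum_x p(x) * log(p(x) / q(x)), in nats."""
    total = 0.0
    for px, qx in zip(p, q):
        if px == 0.0:
            continue  # convention: 0 * log(0 / q) = 0
        if qx == 0.0:
            return math.inf  # convention: p * log(p / 0) = infinity
        total += px * math.log(px / qx)
    return total


def divergence(p, q):
    """Symmetric divergence D[p || q] + D[q || p]."""
    return relative_entropy(p, q) + relative_entropy(q, p)


# Hypothetical probability mass functions over three outcomes.
p = [0.5, 0.4, 0.1]
q = [0.3, 0.3, 0.4]

print(relative_entropy(p, q), relative_entropy(q, p), divergence(p, q))
```

The two relative entropies differ, whereas the divergence is the same whichever distribution is listed first.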


Mutual  Information 

We use the concept of relative entropy to arrive at a measure of
mutual information. Suppose we have two dependent random variables,
X and Y, with joint probability mass function $p(x,y)$ and marginal
mass functions $p(x)$ and $p(y)$. We define the mutual information
to be the relative entropy between the joint mass function and
the product of the marginal mass functions, or

$$I(X:Y) = D[p(x,y) \| p(x)p(y)] = \sum_{x \in X} \sum_{y \in Y} p(x,y) \log \frac{p(x,y)}{p(x)p(y)}.$$

Hence, $I(X:Y)$ defined in this way is the amount of information
about X gained from Y.


Cruise  Missile  Type  and  Speed 

Recalling our example again of the CEC cluster, we note that the
type of enemy cruise missile threatening a friendly fleet can be
inferred somewhat from its speed of approach. However, the relationship
between the two is not exact because the missile may be operating
at a speed other than its nominal speed and several of the missiles
may operate at similar speeds. Nevertheless, if a report of missile
speed is received, it is possible to improve our knowledge about the
type of missile threatening us.

d is real-valued, finite and nonnegative.
d(x,y) = 0 if and only if x = y.
d(x,y) = d(y,x).
d(x,y) ≤ d(x,z) + d(z,y).

(Taken from Kreyszig, 1978.)

For example, suppose we let the random variables S and M
represent the speed and enemy missile type, respectively. We define
the joint probability mass function, $p(s,m)$, in Table 3.2.15 Three
missile types are listed as column headers. Continuous speed has been
divided into four Mach intervals, which are listed in the left-hand
column. The entries in the table are the joint probability mass for the
events $s_i \cap m_j$, or $p(s_i,m_j)$. The marginal distributions $p(s_i)$ and
$p(m_j)$ are the probability that a missile is travelling within the range
$s_i$ and that the missile launched is of type $m_j$, respectively.

From this we calculate the mutual information:

$$I(M:S) = \sum_{j=1}^{3} \sum_{i=1}^{4} p(s_i,m_j) \log \frac{p(s_i,m_j)}{p(s_i)p(m_j)} = 0.222.$$


Table 3.2
Joint Probability Mass Function for Speed and Missile Type

                    C601 (m_1)   C801 (m_2)   SS-N-27 (m_3)   p(s_i)
0-M0.75   (s_1)        0.05         0.04          0.20         0.29
M0.75-1.0 (s_2)        0.14         0.15          0.02         0.31
M1.0-2.0  (s_3)        0.03         0.05          0.07         0.15
>M2.0     (s_4)        0.04         0.01          0.20         0.25
p(m_j)                 0.26         0.25          0.49         1.00

NOTE: The speeds are given in Mach units.


15 Although it is always possible to create such a table, it is generally very difficult to obtain
credible table entries. In most cases, a normalisation process is needed to convert whole numbers
(generally from 1 to 10) supplied by operators to the joint probabilities. In any case, the
entries are more likely to be subjective estimates.
Therefore, the amount of information about cruise missile type
that  can  be  gained  from  the  speed  of  the  incoming  enemy  missile  is 
0.222  nats.16  Because  I(M  :S)  =  I(S :  M),  we  may  also  interpret  this 
to  be  the  amount  of  information  about  the  speed  of  the  incoming 
enemy  cruise  missile  that  can  be  gained  from  its  type. 
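The calculation can be reproduced with a short Python sketch that takes the joint probabilities of Table 3.2, derives the marginals by summing rows and columns, and evaluates the double sum with natural logarithms. The function name `mutual_information` is hypothetical; the result should land close to the 0.222 nats quoted in the text, allowing for rounding.

```python
import math

# Joint probability mass p(s_i, m_j) from Table 3.2.
# Rows: speed intervals s1..s4; columns: missile types m1..m3.
JOINT = [
    [0.05, 0.04, 0.20],
    [0.14, 0.15, 0.02],
    [0.03, 0.05, 0.07],
    [0.04, 0.01, 0.20],
]


def mutual_information(joint):
    """I = sum_ij p_ij * log(p_ij / (p_i * p_j)), in nats."""
    row = [sum(r) for r in joint]        # marginal over rows
    col = [sum(c) for c in zip(*joint)]  # marginal over columns
    total = 0.0
    for i, r in enumerate(joint):
        for j, pij in enumerate(r):
            if pij > 0.0:
                total += pij * math.log(pij / (row[i] * col[j]))
    return total


mi_nats = mutual_information(JOINT)
mi_bits = mi_nats / math.log(2)  # same quantity with base 2 logarithms
print(mi_nats, mi_bits)
```

Transposing the table leaves the result unchanged, which is the symmetry I(M:S) = I(S:M) noted above.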

One way to use mutual information is to develop pairwise joint
probability mass functions for all the critical information elements
and calculate their mutual information. A high mutual information
score between two information elements means that the two are
nonrandomly associated with each other, whereas a score near zero
signifies that the two are close to independent, that is, that the joint
distribution of the two holds no more information than the information
elements considered separately. Butte and Kohane (1999) use this
pairwise assessment of mutual information to associate genes. They
hypothesise that an association with high mutual information means
that one gene is nonrandomly associated with another. They then
select a threshold mutual information level and use the associations
above the threshold to generate gene clusters or relevance networks.17

The  next,  and  more  difficult,  step  is  to  develop  the  appropriate 
weights  from  these  pairwise  associations.  We  have  not  addressed  this 
problem  as  yet;  however,  it  appears  that  the  relevance  network 
approach  suggested  by  Butte  and  Kohane  might  be  applicable. 

Assuming  a  joint  probability  mass  function  can  be  found  for  all 
the  information  elements  shared  across  a  cluster,  we  can  proceed  as 
follows. 

Entropy  and  Mutual  Information 

The knowledge function used to assess understanding relies on the
calculation of information entropy. Consequently, it would be helpful
to exploit the relationship between mutual information and
information entropy. Fortunately, the relationship is quite
straightforward. First, however, we need to develop the concept of
conditional entropy.

16 When information entropy is calculated using base 2 logarithms, the resulting measure of
information present in the distribution is a ‘bit’. When we use natural logarithms, the measure
is the ‘nat’, with reference to the natural logarithm. See Appendix B for a fuller discussion.

17 Two genes are connected in the network if their mutual information scores exceed the
threshold for that cluster.

Conditional entropies $H(X|Y)$ and $H(Y|X)$ are sometimes
referred to as ‘side information’, i.e., the uncertainty (entropy) in one
random variable is conditioned on another random variable.18 If the
random variables X and Y have a joint probability density, $p(x,y)$, the
conditional entropy $H(X|Y)$ is defined as

$$H(X|Y) = -\sum_{y \in Y} p(y) \sum_{x \in X} p(x|y) \log p(x|y) = -\sum_{x \in X} \sum_{y \in Y} p(x,y) \log p(x|y).$$


From  this,  we  can  derive  an  expression  for  mutual  information  in 
terms  of  information  entropy  as  follows: 


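$$\begin{aligned}
I(X:Y) &= \sum_{x \in X} \sum_{y \in Y} p(x,y) \log \frac{p(x,y)}{p(x)p(y)} \\
&= \sum_{x \in X} \sum_{y \in Y} p(x,y) \log \frac{p(x|y)}{p(x)} \\
&= -\sum_{x \in X} \sum_{y \in Y} p(x,y) \log p(x) + \sum_{x \in X} \sum_{y \in Y} p(x,y) \log p(x|y) \\
&= -\sum_{x \in X} p(x) \log p(x) - \left[ -\sum_{x \in X} \sum_{y \in Y} p(x,y) \log p(x|y) \right] \\
&= H(X) - H(X|Y).
\end{aligned}$$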


18 In communications theory, the conditional entropies can be thought of in terms of a
communications channel with input X and output Y. H(X | Y) is then called the equivocation
and corresponds to the uncertainty in the transmission from the receiver’s point of view, and
H(Y | X) is called the prevarication and represents the uncertainty from the transmitter’s
point of view. Taken from Blahut (1987).
This is helpful because all quantities can be expressed in terms of
information entropy.
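The relationship I(X:Y) = H(X) - H(X|Y) can be checked numerically. The sketch below reuses the joint probabilities of Table 3.2, taking X to be speed (rows) and Y to be missile type (columns); the helper names are hypothetical and natural logarithms are assumed.

```python
import math

# Joint probability mass from Table 3.2: rows are speed intervals (X),
# columns are missile types (Y).
JOINT = [
    [0.05, 0.04, 0.20],
    [0.14, 0.15, 0.02],
    [0.03, 0.05, 0.07],
    [0.04, 0.01, 0.20],
]


def entropy(probs):
    """H = -sum p * log p, in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def conditional_entropy_rows_given_cols(joint):
    """H(X|Y) = -sum_xy p(x,y) * log p(x|y), with X on rows, Y on columns."""
    col = [sum(c) for c in zip(*joint)]  # p(y)
    total = 0.0
    for r in joint:
        for j, pxy in enumerate(r):
            if pxy > 0.0:
                total -= pxy * math.log(pxy / col[j])  # p(x|y) = p(x,y)/p(y)
    return total


h_x = entropy([sum(r) for r in JOINT])
h_x_given_y = conditional_entropy_rows_given_cols(JOINT)
mi_from_entropy = h_x - h_x_given_y
print(h_x, h_x_given_y, mi_from_entropy)
```

The difference of the two entropies should agree, up to rounding, with the mutual information obtained directly from the double sum over the table.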

The next step is to consider the multidimensional case. That is,
how is the uncertainty in the perceived values of the critical cluster
information elements $\{a_1, \ldots, a_C\}$ affected by knowledge of the value of
information element Y? By a simple extension of the relationship
developed for two information elements, we get

$$I(X_1, X_2, \ldots, X_C : Y) = H(X_1, X_2, \ldots, X_C) - H(X_1, X_2, \ldots, X_C | Y).$$

Assuming, of course, that the joint and conditional probabilities
can be defined, this gives us a closed-form value for joint entropy.

Another way to get at this value is to use conditional entropy.
For the case in which all information elements are independent, the
joint entropy calculation is additive so that

$$H(X_1, X_2, \ldots, X_C) = \sum_{i=1}^{C} H(X_i).$$

This is, in effect, an upper bound on joint entropy so that, in general,

$$H(X_1, X_2, \ldots, X_C) \le \sum_{i=1}^{C} H(X_i).$$

However, if the conditional entropies can be calculated, joint entropy
can be calculated directly as

$$H(X_1, X_2, \ldots, X_C) = \sum_{i=1}^{C} H(X_i | X_{i-1}, \ldots, X_1).$$
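For two information elements, both the upper bound and the chain-rule decomposition can be illustrated with the speed and missile-type probabilities of Table 3.2. The Python sketch below is illustrative only; the variable names are hypothetical and natural logarithms are assumed.

```python
import math

# Joint probability mass from Table 3.2: rows are speed intervals (S),
# columns are missile types (M).
JOINT = [
    [0.05, 0.04, 0.20],
    [0.14, 0.15, 0.02],
    [0.03, 0.05, 0.07],
    [0.04, 0.01, 0.20],
]


def entropy(probs):
    """H = -sum p * log p, in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Joint entropy H(S, M) over all twelve cells.
h_joint = entropy([p for r in JOINT for p in r])

# Marginal entropies H(S) and H(M).
h_s = entropy([sum(r) for r in JOINT])
h_m = entropy([sum(c) for c in zip(*JOINT)])

# Conditional entropy H(M|S) = -sum p(s,m) * log p(m|s).
h_m_given_s = -sum(
    p * math.log(p / sum(r)) for r in JOINT for p in r if p > 0.0
)

print(h_joint, h_s + h_m_given_s, h_s + h_m)
```

The first two printed values coincide (chain rule), and both fall below the third (the sum of the marginal entropies) because speed and type are not independent.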


Summing  Up 

The degree of uncertainty in a cluster depends on the information
collection assets devoted to the cluster’s critical elements of information
and the extent to which collaboration among the cluster decision
nodes is facilitated. We apply a probabilistic entropy model to represent
the uncertainty associated with the critical information elements
needed within the cluster. The reports from sensors or other
information-gathering sources are treated as estimates of the means of
the distributions describing their uncertainty.

These estimates are transformed into awareness and knowledge
and form the basis of decisionmaking. The metrics we have developed
in this chapter quantify this process through the use of information
entropy and the derivative knowledge metrics.

Information sharing among nodes ideally tends to lower information
entropy (and hence increases knowledge) because of the
reduction in variance and the buildup of correlations among the critical
information elements. One of the key aspects of increased knowledge
is increased understanding.

A key requisite for calculating cluster (and eventually network)
knowledge is an acceptable method for combining knowledge gained
from all critical information elements at a single headquarters, how
that combination produces cluster knowledge, and how cluster
knowledge combines to produce network knowledge. We have suggested
several combining techniques, several of which require knowledge
of a joint probability distribution. In many cases, the joint probability
distribution of all critical information elements is not known
and is difficult, if not impossible, to calculate empirically.


CHAPTER  FOUR 


The  Effects  of  Collaboration 


Networks provide an opportunity for participating decision entities
to collaborate by sharing information as they form clusters. This is
generally thought to be a good thing, as we have seen so far. By sharing,
we experience synergistic effects that improve what we know (the
completeness of our information) and how accurately we know it, as
measured in terms of its bias and precision. In other words, collaboration
improves both the quantity and quality of the information we
need to take decisions. As compelling as this argument is, there is also
a possible negative aspect of collaboration and information sharing:
the expenditure of resources needed to deal with information overload
and disconfirming evidence. The former can lead to processing
delays and the latter to indecision as contradictory information is
resolved. We treat these in more detail in Chapter Five. Here the
focus is on the role of collaboration across a cluster in producing
information that is complete and accurate.


Knowledge 

As discussed earlier, information entropy appears to be a good surrogate
for assessing the knowledge level within a cluster about a given
critical information element. Until now, we have focused only on
knowledge derived from the entropy associated with the probability
distribution for the uncertain information elements. As noted earlier,
the entropy function is always a function of the distribution variance,
and therefore our knowledge function is a function of precision only;
that is, it measures the degree to which the observations of the critical
information element are ‘close together’. To assess the degree to
which the networked headquarters affect decisionmaking, our measure
of knowledge must also include the completeness and the bias of
the estimates. We thus begin by examining the components of accuracy,
namely, bias and precision.

Bias 

Bias in an estimate is error introduced by systematic distortions. For
example, suppose we were to conduct an experiment in which the
temperatures of some substance had to be measured over time. If the
thermometer we used were calibrated such that every reading was off
by 1 degree Celsius, the result would be a set of biased measurements.
Bias therefore is systematic, not random, error.

An unbiased estimator therefore is one for which $E[\hat{\mu}] = \mu$. That
is, the expected value of the estimate of the parameter, $\hat{\mu}$, is the true
value of the parameter, $\mu$. Thus, the bias in the estimate is the degree
to which this is not true, or $|E[\hat{\mu}] - \mu|$.
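The thermometer example can be made concrete with a small Python sketch. The readings and the true temperature are hypothetical, and the sample mean of the readings is used as a stand-in for the expected value of the estimate. The scatter of each set of readings is computed as well, to show that a systematic offset is separate from random variation.

```python
from statistics import mean, pvariance

TRUE_TEMPERATURE = 20.0  # hypothetical ground truth, degrees Celsius

# Hypothetical readings: a steady thermometer that reads about 1 degree
# high, and a well-calibrated thermometer with noisy readings.
offset_readings = [21.0, 21.1, 20.9, 21.0, 21.0]
noisy_readings = [18.0, 22.0, 19.0, 21.0, 20.0]


def estimated_bias(readings, truth):
    """|mean(readings) - truth|, using the sample mean in place of E[estimate]."""
    return abs(mean(readings) - truth)


for name, readings in (("offset", offset_readings), ("noisy", noisy_readings)):
    print(name, estimated_bias(readings, TRUE_TEMPERATURE), pvariance(readings))
```

In this illustration the offset thermometer shows a bias of about 1 degree with very little scatter, while the noisy thermometer shows no bias but a much larger variance.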

Precision 

The variation in estimates of the critical information elements can
occur in a purely random way. For example, an observer may make
an error in judgment such as reporting a tracked personnel carrier to
be a tank. The variation may also be the result of uncontrollable
environmental conditions, such as weather patterns, that cause sensor
occlusions. In any event, random errors of this kind affect the precision
of the estimates reported because they increase the variance of
the distribution of the estimated information element. In general,
precision is defined to be the degree to which estimates of the critical
information element(s) are close together.1 Bias and precision, therefore,
are independent. That is, biased estimates may or may not be
precise.

1 This is a commonly accepted definition. Ayyub and McCuen (1997, p. 191) define precision
as ‘the ability of an estimator to provide repeated estimates that are very close together’.
A similar definition can be found in Pecht (1995).

Precision  and  Entropy 

In Appendix A, we describe the Rapid Planning Process by way of a
simple operational example based on a local perceived force ratio.2
For a given cluster i, at intermediate time steps j, we need only pursue
the process as far as assessing the probability of each fixed pattern
within the conceptual space of a local decisionmaker within the cluster.
(These are the stored situations depicted in Figure 2.1.) The
estimate produced is declared to be one of the stored situations, provided
the estimate falls within the decisionmaker’s comfort zone.

The joint probability density $f(\mathbf{x}_i(j))$, a multivariate normal
distribution with covariance matrix $\Sigma$, reflects the uncertainty associated
with the critical information elements shared across cluster i at
time step j, where $\mathbf{x}_i(j) = [x_{i,1}(j), x_{i,2}(j), \ldots, x_{i,C}(j)]$, the vector of
perceived values of the critical information elements in the shared
conceptual space, assuming all C elements are available to the cluster.
This is the shaded area labelled ‘current estimate’ in Figure 2.1
(Chapter Two).

The mean of the current estimate, $\boldsymbol{\mu}_i(j)$, reflects the current
best estimate, based on reports received from organic sources and
information shared across the cluster, and the covariance, $\Sigma$, reflects
the precision of this estimate. The amount of information available in
the joint (multivariate normal) probability density is measured in
terms of the relative information entropy, $H(\mathbf{x}_i(j)) = \log|\Sigma|$. Both
precision and information entropy are a function of the covariance.


2 The perceived force ratio is calculated from the recognised picture, generally defined by a
number of attributes (elements of information). A detailed description of both the recognised
picture and the perceived force ratio can be found in Chapter 2 of Moffat (2002).
Estimating Local Knowledge

Local knowledge is a measure of understanding within cluster i.3 As
demonstrated earlier, there is an inverse relation between entropy and
knowledge based on precision alone: As entropy increases, knowledge
decreases. In general, we get the following knowledge metric for a
joint distribution of the information elements, which is multivariate
normal:4

$$K(\mathbf{x}_i(j)) = 1 - \frac{|\Sigma|}{|\Sigma|_{\max}},$$

where $\log|\Sigma|_{\max}$ is the maximum relevant information entropy and
$|\Sigma|_{\max}$ is the determinant of the corresponding covariance matrix.
$K(\mathbf{x}_i(j))$ reflects the level of understanding within the cluster based
on precision alone.

Precision  and  Knowledge  in  the  Logistics  Example 

To illustrate, we continue with the logistics example from Chapter
Two. In Figure 2.3a, when there is no collaboration among the
nodes, each is monitoring its requirement for fuel and providing
estimates to the single master decision node. This configuration produces
a single cluster comprising only the master node. When the
nodes are collaborating, as in Figure 2.3b, information is shared
between the two nodes, and therefore we take them to be dependent.
The network in this case is a single cluster consisting of the two
demand nodes and the arbiter node.


3  By  understanding,  we  mean  the  ability  of  humans  to  draw  inferences  about  the  possible 
consequences  of  a  situation.  Clearly,  knowledge  enhances  this  ability  and  therefore  can  be 
considered  a  measure  of  understanding. 

4 Actually, the exact entropy value for the bivariate normal case is $H(x,y) = \tfrac{1}{2}\log[(2\pi e)^2 |\Sigma|]$.
However, because we are concerned about the ratio of entropies, we use the simpler version.
For simplicity, we start by assuming that the fuel requirement is
normally distributed.5 Consequently, we let $a_1$ be the information
element ‘fuel demand at node 1’ and $a_2$ be ‘fuel demand at node 2’.
The fuel supply at the master node is assumed to be known with certainty;
that is, the master node is self-aware in both cases. Consequently,
the critical information element set is $A = \{a_1, a_2\}$, and the
value of each is depicted as $\mathbf{x} = [x_1, x_2]^T$. In each case, the distribution
of uncertainty about the information element is normal with mean
$\boldsymbol{\mu} = [\mu_1, \mu_2]^T$ and covariance

$$\Sigma = \begin{bmatrix} \sigma_1^2 & \rho_{1,2}\sigma_1\sigma_2 \\ \rho_{1,2}\sigma_1\sigma_2 & \sigma_2^2 \end{bmatrix}.$$

The fuel levels at each of the demanding nodes may be correlated,
and the effects of collaboration are therefore dependent on the
correlation coefficient, $-1 \le \rho_{1,2} \le 1$.6 If, as in Figure 2.3a, the nodes
are not collaborating, the ‘network’ consists of the master node and
the two isolated demand nodes, and we model the lack of collaboration
by setting $\rho_{1,2} = 0$. That is, we assume that each headquarters is
acting independently, and therefore all their demands for fuel are
independent. The off-diagonal elements in the covariance matrix
then are 0. This is clearly the simplest case to analyse because the
implications of ‘no collaboration’ are clear in that it produces
uncorrelated fuel levels. In cases like Figure 2.3b, where collaboration
between the nodes takes place and $\rho_{1,2} \ne 0$, collaboration can be
shown to have a salutary effect.

In general, total cluster information entropy is

$$H(\mathbf{x}) = \log|\Sigma| = \log[\sigma_1^2 \sigma_2^2 (1 - \rho_{1,2}^2)].$$


5  This  is  valid  only  when  the  mean  demand  is  large  and  the  variance  is  suitably  small. 

6 Although the fuel is received from the same source, the demand may be generated independently
and therefore may be uncorrelated.
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In  the  non-collaboration  case,  this  reduces  to  H(x)  =  log(CiG2). 
Because  entropy  measures  the  degree  of  uncertainty  in  probability 
distributions,  small  values  of  //(x)  are  desirable.  Regardless  of  the 
values  of  the  variances  Gj  and  o2,  this  will  occur  when  |  p12  |  is  near 
1.0.  Conversely,  maximum  entropy  and,  therefore,  maximum  uncer¬ 
tainty  occur  for  p12=0.  The  change  in  entropy  from  the  non¬ 
collaboration  case  and  the  collaboration  case  then  is 

Ha  (x)  “■ Hb  (x)  =  — log(! -| Pu )  • 


As  before,  a  value  of  |  p12  |  close  to  1.0  maximises  this  quantity. 
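The entropy comparison above can be checked numerically. The following is a minimal sketch that uses the report's simplified entropy measure $\log|\Sigma|$ (additive constants dropped); the function names and the sample values of $\sigma_1$, $\sigma_2$, and $\rho_{1,2}$ are illustrative assumptions, not part of the original analysis.

```python
import math


def cluster_entropy(sigma1, sigma2, rho):
    """Return log|Sigma| for a bivariate normal with the given parameters."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    # Determinant of the 2x2 covariance matrix.
    det = (sigma1 ** 2) * (sigma2 ** 2) * (1.0 - rho ** 2)
    return math.log(det)


def entropy_reduction(rho):
    """Return H_a - H_b = -log(1 - rho^2), the drop in entropy from collaboration."""
    return -math.log(1.0 - rho ** 2)


# Hypothetical values: standard deviations 2 and 3, correlation 0.8.
h_no_collab = cluster_entropy(2.0, 3.0, 0.0)
h_collab = cluster_entropy(2.0, 3.0, 0.8)
reduction = entropy_reduction(0.8)
```

The reduction depends only on the correlation coefficient, not on the variances, which is why the text can say that $|\rho_{1,2}|$ near 1.0 is best regardless of $\sigma_1^2$ and $\sigma_2^2$.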

To convert entropy to knowledge, we first need to establish a
maximum entropy value,7 which is equivalent to establishing a
maximum variance or determinant of the covariance matrix as developed
in Chapter Three. Because the determinant of the covariance matrix is
largest when the random variables are uncorrelated, we have

$$
H_{\max}(a_1, a_2) = \log\left(\sigma_{1,\max}^2\,\sigma_{2,\max}^2\right).
$$


We can now develop the knowledge metric to measure the current
level of understanding of the fuel demand for both the
non-collaboration (a) and collaboration (b) cases. We have for the
non-collaboration case that

$$
K_a(\mathbf{x}) = 1 - \frac{\sigma_1^2\sigma_2^2}{\sigma_{1,\max}^2\,\sigma_{2,\max}^2}.
$$

7 This is necessary because entropy for continuous random variables
(referred to as differential entropy, see Appendix B) is always
unbounded.

For the collaboration case, we have

$$
K_b(\mathbf{x}) = 1 - \frac{\sigma_1^2\sigma_2^2\left(1-\rho_{1,2}^2\right)}{\sigma_{1,\max}^2\,\sigma_{2,\max}^2}.
$$

The benefits of collaboration therefore are measured as the difference
between the two, or the increase in understanding represented by

$$
\Delta K(\mathbf{x}) = K_b(\mathbf{x}) - K_a(\mathbf{x}) = \frac{\rho_{1,2}^2\,\sigma_1^2\sigma_2^2}{\sigma_{1,\max}^2\,\sigma_{2,\max}^2}.
$$


Here the buildup in correlation between the information elements
$a_1$ and $a_2$ causes the increase in knowledge. This relates to our
commonsense understanding of an increase in knowledge of our
surroundings because we know how one thing relates to another.
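As a rough illustration of the knowledge gain, the sketch below assumes the knowledge metric takes the form $1 - |\Sigma|/|\Sigma|_{\max}$ for the bivariate normal case, which is consistent with the gain $\rho_{1,2}^2\sigma_1^2\sigma_2^2/(\sigma_{1,\max}^2\sigma_{2,\max}^2)$ stated above. The function names and numbers are hypothetical.

```python
def knowledge(sigma1, sigma2, rho, sigma1_max, sigma2_max):
    """Knowledge as 1 - |Sigma| / |Sigma|_max (assumed form for the normal case)."""
    det = (sigma1 ** 2) * (sigma2 ** 2) * (1.0 - rho ** 2)
    det_max = (sigma1_max ** 2) * (sigma2_max ** 2)
    return 1.0 - det / det_max


def collaboration_gain(sigma1, sigma2, rho, sigma1_max, sigma2_max):
    """Closed-form gain: rho^2 * sigma1^2 * sigma2^2 / (sigma1_max^2 * sigma2_max^2)."""
    return (rho ** 2) * (sigma1 ** 2) * (sigma2 ** 2) / (
        (sigma1_max ** 2) * (sigma2_max ** 2)
    )


# Hypothetical fuel-demand uncertainties for the two demand nodes.
k_a = knowledge(2.0, 3.0, 0.0, 4.0, 5.0)   # no collaboration (Figure 2.3a)
k_b = knowledge(2.0, 3.0, 0.6, 4.0, 5.0)   # collaboration (Figure 2.3b)
gain = collaboration_gain(2.0, 3.0, 0.6, 4.0, 5.0)
```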


Accuracy 

Accuracy  is  the  degree  to  which  the  estimates  of  the  critical  informa¬ 
tion  elements  are  close  to  ground  truth.  Collaboration  across  a  cluster 
affects  the  accuracy  of  the  estimates  of  the  information  elements — 
and  hence  the  degree  to  which  fixed  patterns  in  the  shared  conceptual 
space  are  indeed  ground  truth.  The  concept  of  accuracy  comprises 
both  precision  and  bias:  The  smaller  the  bias,  the  closer  the  estimate 
is  to  ground  truth,  and  the  more  precise  the  estimates  (i.e.,  the  more 
closely  they  are  grouped),  the  more  confident  we  are  in  the  estimate. 

We  generally  take  ground  truth  to  be  the  ideal  distribution 
mean  for  the  information  and  measure  the  bias  of  the  estimates  in 
terms  of  the  distance  from  the  ground-truth  value.  This  assumes,  of 
course,  that  we  know  ground  truth,  which  is  always  the  case  in  mod¬ 
els  and  simulations  aimed  at  assessing  and  comparing  alternative 
C4ISR (command, control, communications, computers, intelligence,
surveillance, and reconnaissance) systems, alternative network
constructs,  or  alternative  operating  procedures.  Support  to  actual 
operations  in  which  ground  truth  is  not  known  requires  an  assess¬ 
ment  of  the  consistency  of  the  information  reported.  In  some  cases,  we 
may  instead  take  as  our  point  of  comparison  the  estimates  coming 
from  the  higher-level  planning  process.  For  our  fuel  logistics  illus¬ 
tration,  we  can,  for  example,  compare  the  perception  of  the  fuel 
demand  from  each  of  the  two  nodes,  with  the  assessment  made  from 
the  top-down  planning  assumptions. 

In general, if $a$ is an information element whose value, $x$, is
unknown with probability distribution $f(x)$ and mean $\mu$
representing ground truth, then the bias associated with the estimate
of the mean is $b = |E(\hat{\mu}) - \mu|$, where $\hat{\mu}$ is the
estimate of the mean based on one or more reports on the value of $a$.
Because accuracy consists of both bias and precision, we need a metric
that combines both. One such metric is the mean square error (MSE),
defined as $E[(\hat{\mu} - \mu)^2]$. It can be shown that
$E[(\hat{\mu} - \mu)^2] = b^2 + \sigma^2$, where $\sigma^2$ is the
variance of $\hat{\mu}$.8 This metric is extremely useful because it
includes both accuracy in the total and precision as a component. In
estimating ground truth, the bias accounts for nonrandom errors and
the precision accounts for random errors.
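The decomposition of the MSE into squared bias plus variance can be verified on a small sample of estimates. This is a minimal sketch; the sample values are invented for illustration, and the population variance of the sample stands in for the variance of $\hat{\mu}$.

```python
from statistics import fmean, pvariance


def mse_decomposition(estimates, truth):
    """Return (bias, variance, mse) for a list of estimates of a known truth."""
    mean_estimate = fmean(estimates)
    bias = abs(mean_estimate - truth)               # nonrandom error
    variance = pvariance(estimates, mu=mean_estimate)  # random error (precision)
    mse = fmean([(e - truth) ** 2 for e in estimates])
    return bias, variance, mse


# Hypothetical reports of a quantity whose ground-truth value is 10.0.
bias, variance, mse = mse_decomposition([9.0, 10.0, 11.0, 12.0], truth=10.0)
# For any sample, mse equals bias**2 + variance up to rounding.
```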

To  illustrate  this,  in  our  CEC  cluster  example,  suppose  we  want 
to  estimate  the  location  of  an  enemy  cruise  missile  based  on  several 
sequentially  arriving  reports  from  the  collaborating  team.  Each  report 
is  processed  in  turn  using  Bayesian  updating  to  refine  the  location 
estimate.9  In  this  case,  we  need  an  estimate  for  both  the  x-coordinate 
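and the y-coordinate. The bivariate normal distribution is used to
represent the uncertainty associated with the random location vector,
$\mathbf{x} = [x, y]^T$. The estimator in this case is a Bayesian
estimator of the form:

$$
\boldsymbol{\mu}_{t+1} = \hat{\Sigma}\left(\Sigma_t + \hat{\Sigma}\right)^{-1}\boldsymbol{\mu}_t + \Sigma_t\left(\Sigma_t + \hat{\Sigma}\right)^{-1}\hat{\boldsymbol{\mu}} \qquad (4.1)
$$

$$
\Sigma_{t+1} = \Sigma_t\left(\Sigma_t + \hat{\Sigma}\right)^{-1}\hat{\Sigma} \qquad (4.2)
$$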


8  See,  for  example,  Cover  and  Thomas  (1991). 

9 Later in this chapter, we suggest maintaining the incoming reports
and variance estimates until a decision is imminent. If we perform the
updates sequentially at that time, we can account for the age of the
reports, essentially discounting older reports.

In this formulation, $\hat{\boldsymbol{\mu}}$ is the collaborating
team’s current estimate of $\boldsymbol{\mu} = [\mu_x, \mu_y]^T$. The
instrument (sensor, source, process) used to obtain the estimate has
an associated error, which we record as $\hat{\Sigma}$, an estimate of
the variance. This may be acquired from the target location error
associated with sensor or source and existing environmental conditions
prevailing when the measurement was made.10 This matrix serves as a
weight. For large $\hat{\Sigma}$, the expression
$\Sigma_t(\Sigma_t + \hat{\Sigma})^{-1}$ is very small (close to the
zero matrix) and $\hat{\Sigma}(\Sigma_t + \hat{\Sigma})^{-1}$ is
approximately the identity. Therefore, the current report has little
influence on $\boldsymbol{\mu}_{t+1}$. The reverse is true for small
$\hat{\Sigma}$. The initiating estimates, $\boldsymbol{\mu}_0$ and
$\Sigma_0$, are obtained from the IPB process or are estimates
prevailing at the last decision point.
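The weighting behaviour described above can be sketched for the two-coordinate case. The code assumes the standard conjugate-normal update, in which the current estimate is weighted by $\hat{\Sigma}(\Sigma_t + \hat{\Sigma})^{-1}$ and the new report by $\Sigma_t(\Sigma_t + \hat{\Sigma})^{-1}$; the helper names and the numbers are illustrative only, and the 2x2 matrix routines are written out by hand so that the sketch needs no external libraries.

```python
def add2(p, q):
    """Element-wise sum of two 2x2 matrices."""
    return [[p[i][j] + q[i][j] for j in range(2)] for i in range(2)]


def matmul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def matvec2(m, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [m[i][0] * v[0] + m[i][1] * v[1] for i in range(2)]


def inv2(m):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def bayes_update(mu_t, cov_t, report, report_cov):
    """One Bayesian update of a location estimate from a single report."""
    s_inv = inv2(add2(cov_t, report_cov))
    weight_current = matmul2(report_cov, s_inv)  # near identity for a noisy report
    weight_report = matmul2(cov_t, s_inv)        # near zero for a noisy report
    mu_next = [c + r for c, r in zip(matvec2(weight_current, mu_t),
                                     matvec2(weight_report, report))]
    cov_next = matmul2(matmul2(cov_t, s_inv), report_cov)
    return mu_next, cov_next


identity = [[1.0, 0.0], [0.0, 1.0]]
# Equally trusted prior and report: the estimate moves halfway.
mu_equal, cov_equal = bayes_update([0.0, 0.0], identity, [2.0, 4.0], identity)
# Very noisy report: the estimate barely moves.
noisy = [[1e6, 0.0], [0.0, 1e6]]
mu_noisy, cov_noisy = bayes_update([0.0, 0.0], identity, [2.0, 4.0], noisy)
```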

The  task  now  is  to  assess  just  how  accurate  the  estimate  is.  If  we 
are  conducting  a  controlled  experiment,  such  that  the  true  location  of 
the  unit  is  known,  then,  as  mentioned  above,  we  can  take  advantage 
of  the  fact  and  calculate  the  bias  in  the  estimate.  In  this  case,  the  bias 
is  the  Euclidean  distance  between  the  Bayesian  estimate  and  the 
ground-truth  location  of  the  unit,  or 


10 It is also possible to use the sample mean of several reports as
an estimate, or to use the latest of several reports, the ‘best’
report, etc. Each will require an accompanying estimate of the
variance.

For our purposes, we can take this estimate to be the true location of
the enemy cruise missile at a specific moment in time. By analogy
with the MSE, the accuracy of the estimate is defined as

$$
D(\mathbf{x}) = b^2 + |\Sigma|.
$$


Accuracy  in  the  Logistics  Example 

Recall that the amount of fuel required at each node is
$\mathbf{x} = [x_1, x_2]^T$. We assume that $\mathbf{x}$ is bivariate
normal with mean $\boldsymbol{\mu} = [\mu_1, \mu_2]^T$. Reports
on  projected  fuel  requirements  are  processed  sequentially  to  create  a 
current  estimate  of  future  requirements  for  both  nodes.  As  in  the 
location  estimate  discussed  above,  the  estimator  is  Bayesian  and  the 
bias  is  the  Euclidean  distance  between  the  estimates  and  ground 
truth.  However,  unlike  the  location  example  above,  the  error  associ¬ 
ated  with  each  report  is  generated  from  two  sources.  In  the  first  case 
(no  collaboration),  the  errors  are  independent,  and  in  the  second  case 
(collaboration)  they  are  not.  We  also  assume  that  a  report  is  received 
from  both  nodes  near-simultaneously. 

The estimate covariance matrices depend on the model selected
and the update methodology. For example, in the no-collaboration
case (Figure 2.3a), the sample covariance matrix is

$$
\Sigma_a = \begin{bmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{bmatrix}.
$$

When they are collaborating, as in Figure 2.3b, the sample covariance
matrix is

$$
\Sigma_b = \begin{bmatrix} \sigma_1^2 & \rho_{1,2}\sigma_1\sigma_2 \\ \rho_{1,2}\sigma_1\sigma_2 & \sigma_2^2 \end{bmatrix}.
$$

Generating these estimates may be problematic. The sources of the
reports  are  generally  the  headquarters  themselves,  so  the  errors  are 
associated  with  both  the  assessment  of  fuel  on  hand  and  future 
requirements.  Standards  may  exist  for  predicting  fuel  requirements 
that  vary  with  a  unit’s  posture.  In  any  event,  it  will  be  necessary  to 
provide  error  estimates  in  both  cases. 

The bias is the Euclidean distance, as previously discussed, so
that

$$
b = \sqrt{(\hat{\mu}_1 - \mu_1)^2 + (\hat{\mu}_2 - \mu_2)^2}.
$$

At any time, the estimates of the covariance for both cases are

$$
|\Sigma_a| = \sigma_1^2\sigma_2^2 \quad\text{and}\quad |\Sigma_b| = \sigma_1^2\sigma_2^2 - \rho^2\sigma_1^2\sigma_2^2,
$$

and therefore the accuracy metrics for the two cases are (iteration
subscripts omitted)

$$
D_a(\mathbf{x}) = b^2 + \sigma_1^2\sigma_2^2 \quad\text{and}\quad D_b(\mathbf{x}) = b^2 + \sigma_1^2\sigma_2^2 - \rho^2\sigma_1^2\sigma_2^2.
$$

Consequently, the increase in accuracy in the collaboration case is
$\rho^2\sigma_1^2\sigma_2^2$. Again, this quantity is maximised when
$|\rho|$ is close to 1.0. The task now is to measure these effects on
cluster and network knowledge.


The Effects of Bias, Precision, and Accuracy on Knowledge

One  way  to  account  for  bias,  precision,  and,  hence,  accuracy  in  the 
knowledge  function  is  to  replace  the  distribution  variance  with  the 
MSE, or the accuracy measure, $D(\mathbf{x})$, in the knowledge function.
Doing  so  has  the  effect  of  increasing  the  variance  to  account  for  bias. 
The  MSE  is  bounded  from  below  by  the  variance,  so  when  the  bias  is 
0, the MSE is just the variance. In the general case, we express
knowledge as

$$
K_M(\mathbf{x}) = 1 - e^{-\left[H_{M,\max}(\mathbf{x}) - H_M(\mathbf{x})\right]},
$$

where $K_M(\mathbf{x})$ is the knowledge function with the variance
replaced by the MSE. To do this, we calculate the maximum and current
entropies in the usual way and then replace the variance (or more
generally, the covariance) with the MSE.

For the multivariate normal case, for example, we get a modified
knowledge function of the form:

$$
K_M(\mathbf{x}) = 1 - \frac{b^2 + |\Sigma|}{\left(b^2 + |\Sigma|\right)_{\max}}.
$$

The ‘maximum’ MSE is a combination of the maximum bias and the
maximum precision and represents the maximum in inaccuracy.
Because bias and precision are independent, the maximum occurs
when both are maximised, or
$(b^2 + |\Sigma|)_{\max} = b_{\max}^2 + |\Sigma|_{\max}$. Like the
variance, a suitable upper bound for bias can be found by searching
for the largest possible measurement error the sensors or sources
might produce.

We can apply this to the simple logistics problem. For the
non-collaboration case, we get

$$
K_{M,a}(\mathbf{x}) = 1 - \frac{b^2 + \sigma_1^2\sigma_2^2}{b_{\max}^2 + \sigma_{1,\max}^2\,\sigma_{2,\max}^2}.
$$

For the collaboration case, we have

$$
K_{M,b}(\mathbf{x}) = 1 - \frac{b^2 + \sigma_1^2\sigma_2^2 - \rho^2\sigma_1^2\sigma_2^2}{b_{\max}^2 + \sigma_{1,\max}^2\,\sigma_{2,\max}^2}.
$$

The maximum MSE is the same for both cases, given that the covariance
is maximised when $\rho = 0$ and the variances are fixed. The
difference between the two now reflects the effects of collaboration
on knowledge as measured by precision, accuracy, and bias and is
calculated to be

$$
K_{M,b}(\mathbf{x}) - K_{M,a}(\mathbf{x}) = \frac{\rho^2\sigma_1^2\sigma_2^2}{b_{\max}^2 + \sigma_{1,\max}^2\,\sigma_{2,\max}^2}.
$$

As expected, this quantity is diminished over the previously calculated
values that considered precision only. However, if the estimate is
unbiased ($b = 0$), the results are the same. Also, in the rare case
that an estimate is reported as ground truth (no variance), bias is
still possible so that there is no improvement in knowledge from the
non-collaboration to the collaboration case.
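The effect of replacing the variance with the MSE can be sketched for the logistics example. The code assumes the accuracy measure $D = b^2 + \sigma_1^2\sigma_2^2(1-\rho^2)$ and a maximum MSE of $b_{\max}^2 + \sigma_{1,\max}^2\sigma_{2,\max}^2$, as described in this section; all function names and numerical values are hypothetical.

```python
def accuracy(bias, sigma1, sigma2, rho=0.0):
    """Accuracy measure D = b^2 + |Sigma| for the two-node fuel example."""
    return bias ** 2 + (sigma1 ** 2) * (sigma2 ** 2) * (1.0 - rho ** 2)


def knowledge_mse(bias, sigma1, sigma2, rho, bias_max, sigma1_max, sigma2_max):
    """Knowledge with the variance replaced by the MSE (assumed normal-case form)."""
    mse_max = bias_max ** 2 + (sigma1_max ** 2) * (sigma2_max ** 2)
    return 1.0 - accuracy(bias, sigma1, sigma2, rho) / mse_max


# Hypothetical values: bias 1, maximum bias 3, sigmas 2 and 3, maxima 4 and 5.
k_no_collab = knowledge_mse(1.0, 2.0, 3.0, 0.0, 3.0, 4.0, 5.0)
k_collab = knowledge_mse(1.0, 2.0, 3.0, 0.6, 3.0, 4.0, 5.0)
gain_with_bias = k_collab - k_no_collab

# With zero reported variance there is nothing for collaboration to improve.
gain_no_variance = (knowledge_mse(1.0, 0.0, 0.0, 0.6, 3.0, 4.0, 5.0)
                    - knowledge_mse(1.0, 0.0, 0.0, 0.0, 3.0, 4.0, 5.0))
```

Under these assumptions the gain is smaller than the precision-only gain for the same inputs, because the maximum bias enlarges the denominator.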

We  next  discuss  the  contribution  to  information  sharing  of  the 
completeness  of  the  information  available  to  take  a  decision.  The 
combining  of  precision,  accuracy,  bias,  and  completeness  then  will 
measure  the  overall  contribution  of  collaboration  across  the  cluster  to 
knowledge  and  thus  to  improved  local  decisionmaking. 


Completeness 

For any cluster $i$, we have defined the complete data set at time
$t$ as the set
$\mathbf{x}_i(t) = [x_{i,1}(t), x_{i,2}(t), \ldots, x_{i,C}(t)]$. The
set consists of a maximum of $C$ elements of critical information;
however, only a subset consisting of $n < C$ out of $C$ elements might
be available at time $t$. If waiting for additional reports is not
possible, a decisionmaker would be required to take a decision without
the benefit of complete information. Depending on his experience and
other contextual information, he may be able to infer some likely less
reliable value for the missing information. For now, we assume that if
the value of an information element is missing, the value of
completeness for cluster $i$ is

$$
X_i(n) = \left(\frac{n}{C}\right)^{\xi},
$$

where $\xi$ is a ‘shaping’ factor that reflects the decisionmaker’s
aversion to risk because the selection of the appropriate value
depends on the consequences, as perceived by the decisionmaker, of
being forced to take a decision with incomplete information. For
values of $\xi < 1$, the curve is concave downwards, thus reflecting a
high aversion to risk; for $\xi > 1$, it is concave upwards,
reflecting little aversion to risk; and for $\xi = 1$, it is a
straight line, reflecting the decisionmaker’s equivocation concerning
risk. The ultimate impact of this lack of completeness is the
uncertainty of the decisionmaker’s perception of where he is in the
conceptual space, as depicted in Figure 2.1. The selection of the
appropriate values depends on the consequences associated with
being forced to take a decision with incomplete information.
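A short sketch shows how the shaping factor changes the value placed on partial information. It assumes the completeness measure has the power form $(n/C)^{\xi}$, which is consistent with the curve shapes described above; the function name and the sample values are illustrative.

```python
def completeness(n, c, xi):
    """Completeness of n available elements out of c required, shaped by xi > 0."""
    if c < 1 or not 0 <= n <= c:
        raise ValueError("need c >= 1 and 0 <= n <= c")
    if xi <= 0:
        raise ValueError("the shaping factor xi must be positive")
    return (n / c) ** xi


# Half of ten required elements available, under three shaping factors.
concave_down = completeness(5, 10, 0.5)   # xi < 1
straight_line = completeness(5, 10, 1.0)  # xi = 1
concave_up = completeness(5, 10, 2.0)     # xi > 1
```

For a fixed fraction of available elements, a smaller shaping factor gives a larger completeness value, and all three curves meet at 0 and at 1.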

With  the  addition  of  completeness,  we  are  now  ready  to  com¬ 
bine  the  measures  of  collaboration,  namely  accuracy  (i.e.,  precision 
and  bias)  and  completeness,  to  produce  a  single  knowledge  metric 
that  can  be  subsequently  combined  with  the  measures  of  complexity 
discussed  in  Chapter  Five.  But  before  we  develop  the  combined  col¬ 
laboration  metric,  we  must  first  address  another  measure  of  informa¬ 
tion  quality:  its  currency.  It  is  generally  assumed  that  more  recent,  or 
fresher,  information  is  desirable  over  older  information.  This  supposi¬ 
tion  is  certainly  true  in  the  modern  battlespace,  where  events  change 
rapidly.


Information 'Ageing'

The  information  gathered  by  a  cluster  consists  of  reports  concerning 
one  or  more  of  the  critical  information  elements  shared  across  the 
cluster,  which  are  necessary  to  take  a  decision.11  These  reports  are 
used  to  update  the  joint  probability  distribution  of  uncertainty  con¬ 
cerning  these  information  elements.  If  the  reports  are  old,  we  assume 
that  their  contribution  to  reducing  uncertainty  is  less  than  if  they  are 
fresh.  In  addition,  if  resources  within  a  cluster  are  such  that  reports 
arriving  are  not  processed  in  a  reasonable  period  of  time,  they  will  age 
in  a  queue  with  the  same  effect. 

Freshness  is  a  consideration  that  is  separate  from  timeliness. 
Freshness  is  concerned  with  how  old  the  information  is  and,  as  such, 
is  generally  context  free.  Timeliness,  however,  deals  with  when  the 
information  is  needed  and,  as  such,  is  situation  dependent.  Both 
timeliness  and  freshness  are  functions  of  the  time  volatility  of  infor¬ 
mation,  i.e.,  the  rate  at  which  information  is  likely  to  change  over 
time.  For  example,  consider  assessing  the  location  of  a  missile  versus 
the  location  of  a  mountain.  Information  about  the  location  of  a 
mountain  is  considered  time  resilient,  and  therefore  freshness  and 
timeliness  are  essentially  equivalent.  However,  we  take  the  position 
here  that  the  older  the  information  is,  the  lower  its  quality. 

Precision,  bias,  and,  hence,  accuracy  depend  on  the  estimator 
selected  (a  Bayesian  estimator  in  this  case)  to  estimate  fixed  patterns 
of  ground  truth  shared  across  a  cluster.  They  are  also  dependent  on 
the  joint  probability  density  function  that  reflects  the  uncertainty  in 
our  knowledge.  Consequently,  what  is  needed  is  a  methodology  that 
allows  us  to  incorporate  the  age  of  the  reports  in  our  updating  pro¬ 
cess. 

Time  Lapse 

For each critical information element, $a_{i,j}$, shared across
cluster $i$ at time period $j$, we record the time that its estimated
value, $v_{i,j}$, was last reported and shared across cluster $i$,
$t_{i,j}$. If a decision within cluster $i$ is to take place at time
$t_i$, then $F_{i,j} = t_i - t_{i,j}$ is a measure of the freshness of
information element $a_{i,j}$ at the time it is used. We can further
express the importance of freshness by an exponent so that we get

$$
F_{i,j} = \left(t_i - t_{i,j}\right)^{\eta},
$$

where the parameter $\eta > 0$ reflects the degree to which freshness
of a report concerning information element $a_i$ at time period $j$ is
an important consideration in taking a decision, i.e., the time
volatility of the information. For example, the freshness of
information concerning the location of the Baath Party’s headquarters
in An Nasiriyah is not as critical as a report on the location of
Fedayeen Saddam forces in the city.

11 In Chapter Five, we address the issue of unneeded information and
its contribution to ‘information overload’.

To be consistent with other metrics, we choose to normalise $F_{i,j}$
as follows:

$$
\Phi_{i,j} = \left(\frac{t_i - t_{i,j}}{t_i - t_0}\right)^{\eta},
$$

where $t_0$ is the time at which the data collection begins in cluster
$i$ for this decision. In the case of the Baath Party headquarters, a
value of $\eta > 1$ would be appropriate. In the case of the Fedayeen,
we would place considerable importance on fresh information and
therefore assign a value of $\eta < 1$.
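The normalised freshness factor can be sketched as follows, assuming it is the elapsed fraction of the collection window raised to the volatility exponent. The function name, argument names, and times (in arbitrary units) are hypothetical.

```python
def normalised_freshness(t_decision, t_report, t_start, eta):
    """Age of a report as a fraction of the collection window, raised to eta."""
    if t_decision <= t_start or not t_start <= t_report <= t_decision:
        raise ValueError("need t_start <= t_report <= t_decision, t_start < t_decision")
    if eta <= 0:
        raise ValueError("eta must be positive")
    return ((t_decision - t_report) / (t_decision - t_start)) ** eta


# Collection starts at t = 0 and the decision is taken at t = 100.
fresh = normalised_freshness(100.0, 100.0, 0.0, 1.0)      # report arrives at decision time
oldest = normalised_freshness(100.0, 0.0, 0.0, 1.0)       # report from the start of collection
slow_changing = normalised_freshness(100.0, 50.0, 0.0, 2.0)   # eta > 1, e.g. a fixed site
fast_changing = normalised_freshness(100.0, 50.0, 0.0, 0.5)   # eta < 1, e.g. mobile forces
```

A report of the same age is penalised less when the exponent is above 1 and more when it is below 1, which matches the guidance given for the two examples in the text.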

Updating 

Within the time required to take a decision within cluster $i$,
several reports from sensors and sources of the estimated value of
$a_{i,j}$ are likely to be produced, each with time-lapse estimate
$\Phi_{i,j} = [(t_i - t_{i,j})/(t_i - t_0)]^{\eta}$, calculated as
above. By updating the value of the information element over time, we
can also account for the age of the data reported. In this way, we
directly affect the information and therefore its knowledge function.
As mentioned earlier, we have elected to update our estimates using
Bayesian updating.

We assume that at some time $t$, $U$ sequentially arriving reports
concerning information elements,
$\{(\hat{\mu}_1, \hat{\sigma}_1^2), (\hat{\mu}_2, \hat{\sigma}_2^2), \ldots, (\hat{\mu}_U, \hat{\sigma}_U^2)\}$,
are to be combined to support a decision to be made at time $t$.12 The
pairs $(\hat{\mu}_k, \hat{\sigma}_k^2)$ are the $k$th sequential
estimate of the mean and variance of the distribution describing the
uncertainty in the value of the information element $a$. The scalar
versions of equations (4.1) and (4.2), uncorrected for freshness, are

$$
\mu_{k+1} = \frac{\hat{\sigma}_k^2\,\mu_k + \sigma_k^2\,\hat{\mu}_k}{\sigma_k^2 + \hat{\sigma}_k^2}
\quad\text{and}\quad
\sigma_{k+1}^2 = \frac{\sigma_k^2\,\hat{\sigma}_k^2}{\sigma_k^2 + \hat{\sigma}_k^2}.
$$

The pair $(\mu_0, \sigma_0^2)$ are the estimates existing at time
$t_0$. This could be the IPB estimate, or it could be the estimate at
the last decision. Next, we modify the estimates to account for the
freshness of the reports.

We assume that the effect of ageing makes the estimate less
certain. Ageing therefore is a function of the estimated variance
coupled with the normalised freshness factor, $\Phi_k$. For the more
recent reports, $\Phi_k$ is small, and for older reports, it is large.
In any case, $0 \le \Phi_k \le 1$, which suggests a net present value
model for measuring the effect of $\Phi_k$ on the variance of the
estimate; that is, we replace the variance $\hat{\sigma}_k^2$ with
$(1 + \Phi_k)\hat{\sigma}_k^2$.13 This yields the following modified
Bayesian update formulas for producing the current estimate for the
value of the information element $a$:

$$
\mu_{k+1} = \frac{\sigma_k^2\,\hat{\mu}_k + (1 + \Phi_k)\hat{\sigma}_k^2\,\mu_k}{\sigma_k^2 + (1 + \Phi_k)\hat{\sigma}_k^2}
$$

and

$$
\sigma_{k+1}^2 = \frac{\sigma_k^2\,(1 + \Phi_k)\hat{\sigma}_k^2}{\sigma_k^2 + (1 + \Phi_k)\hat{\sigma}_k^2}.
$$

In the best case, when the freshness factor is 0 (the report arrives at
decision time), there is no effect on the reported variance. In the
worst case, when the report dates to the beginning of collection for
the current decision, the reported variance is doubled.

12 We drop the cluster and time period subscripts for clarity. It
should be clear that the information element is required at cluster
$i$ and that the time period at which the combining takes place is
$j$.

13 The net present value, $P$, of a principal amount, $A$, compounded
over $n$ time periods is $P = A(1 + i)^n$, where $i$ is the rate of
return. The argument for an analogous approach in this case is that
freshness can be thought of as the rate at which the variance
increases.
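The freshness-adjusted update can be sketched for the scalar case. The code assumes that each report's variance is inflated by one plus its normalised freshness factor before the usual precision-weighted combination; the function names, the tuple layout of a report, and the numbers are illustrative assumptions.

```python
def aged_update(mu_k, var_k, report_mu, report_var, phi):
    """One scalar Bayesian update with the report variance inflated by (1 + phi)."""
    if not 0.0 <= phi <= 1.0:
        raise ValueError("phi must lie in [0, 1]")
    aged_var = (1.0 + phi) * report_var
    denom = var_k + aged_var
    mu_next = (aged_var * mu_k + var_k * report_mu) / denom
    var_next = var_k * aged_var / denom
    return mu_next, var_next


def combine_reports(mu0, var0, reports):
    """Apply aged_update to (mean, variance, phi) reports in arrival order."""
    mu, var = mu0, var0
    for report_mu, report_var, phi in reports:
        mu, var = aged_update(mu, var, report_mu, report_var, phi)
    return mu, var


# The same report treated as perfectly fresh (phi = 0) and as old as possible (phi = 1).
fresh_mu, fresh_var = aged_update(0.0, 1.0, 2.0, 1.0, 0.0)
stale_mu, stale_var = aged_update(0.0, 1.0, 2.0, 1.0, 1.0)
final_mu, final_var = combine_reports(0.0, 1.0, [(2.0, 1.0, 1.0), (2.0, 1.0, 0.0)])
```

An older report pulls the estimate less far from its current value and leaves more residual variance than the same report received at decision time.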

The final estimate, $\mu_U$, calculated in this way is taken to be
the estimate of the true mean of the distribution with bias, and
variance estimate, $\sigma_U^2$. The updated density function is
therefore $f(x; \mu_U, \sigma_U^2)$. From this we can calculate a
current, updated knowledge estimate, $K_M(x)$.14


Measuring  the  Overall  Effect  of  Cluster  Collaboration 

Finally, we combine the currency-adjusted precision and accuracy
knowledge function with completeness to arrive at a single metric to
assess the effects of collaboration across the cluster. The ideal case
is when we have full completeness, i.e., $X_i(n) = X_i(C) = 1$, and
the knowledge shared across the cluster is fully accurate, i.e.,
$K_M(\mathbf{x}) = 1$, for the multivariate case. In this case,
collaboration is able to provide complete information, and its
accuracy provides the local decisionmaker with perfect knowledge or
situational awareness. Unfortunately, this ideal is seldom, if ever,
achieved. Consequently, we require a construct that gauges the degree
to which accuracy, as calculated here, and completeness contribute to
knowledge.

14 In the special case of a multivariate normal distribution of
uncertainty across the information elements, this effect can be put in
place by adjusting the initial values of the ‘observation noise’ and
‘system noise’ in the DLM (as discussed in West and Harrison, 1997).

The knowledge function, $K_M(\mathbf{x})$, is derived by replacing
the variance in the entropy calculation with the MSE, thus allowing us
to account for both precision and bias. It is logical that we proceed
in the same way with completeness; that is, we replace the MSE with a
function of the MSE and completeness. In general, when $X_i(n)$ is
small (i.e., when there exist estimates for only a small fraction of
the required number of information elements), the knowledge function
should also be small, all things being equal, because this means that
the aggregate accuracy of the estimates is based on only a few
elements of information. One way to reflect this behaviour is to
replace the MSE in the entropy calculation with

$$
\frac{b^2 + \sigma^2}{X_i(n)}.
$$

This calculation has the desirable property that when
$X_i(n) \to 1.0$, the ratio is just the MSE, and that when
$X_i(n) \to 0$, it increases without bound. This indeed reflects the
fact that if we have no information, we have no knowledge and thus the
bias and variance estimates are irrelevant. However, it is not
practical to use this calculation as a lower bound, since it will
drive the ratio used for the maximum entropy to increase without bound
for all values of $n$. To avoid this, we arbitrarily select $n = 1$ to
be the worst case with $X_i(1) = C^{-\xi}$.16 Consequently, the upper
bound on the resultant entropy calculation is

$$
\log\left[\frac{b_{\max}^2 + \sigma_{\max}^2}{C^{-\xi}}\right] = \log\left[C^{\xi}\left(b_{\max}^2 + \sigma_{\max}^2\right)\right].
$$

15 Although we illustrate the discussion with the univariate case,
this applies equally to the multivariate case.

This has the effect of increasing the maximum MSE when the requisite
number of information elements is large. Note that if $C = 1$, there
is no effect on the current entropy calculation or on the maximum
entropy. If we let $K_k(x)$ be the knowledge within the cluster based
on accuracy and completeness, then

$$
K_k(x) = 1 - e^{-\left[H_{K,\max}(x) - H_K(x)\right]},
$$

where $H_{K,\max}(x)$ is the entropy calculated with the maximum
variance replaced with $C^{\xi}(b_{\max}^2 + \sigma_{\max}^2)$ and
$H_K(x)$ is the current entropy calculated with the current variance
replaced with $(b^2 + \sigma^2)/X_i(n)$.

Applying this to the normal case, we get the knowledge gained
when completeness is accounted for as

$$
K_k(x) = 1 - \frac{b^2 + \sigma^2}{n^{\xi}\left(b_{\max}^2 + \sigma_{\max}^2\right)}.
$$

Knowledge increases when the values of more of the requisite
information elements have been reported and is maximised when $n = C$.
This simple formulation is intuitively satisfying because we would
expect that for all the precision and accuracy, unless information on
all the information elements is present, our knowledge will be
deficient. This scales naturally to the multivariate normal case as

$$
K_k(\mathbf{x}) = 1 - \frac{b^2 + |\Sigma|}{n^{\xi}\left(b_{\max}^2 + |\Sigma|_{\max}\right)}.
$$

16 We discuss the special case of $C = 1$ later.

Up to this point, we have captured the effects of collaboration
among  decision  nodes  within  a  cluster  on  knowledge.  The  measured 
effects  of  information  sharing  through  collaboration  are  accuracy  and 
completeness.  For  the  most  part,  these  effects  are  dynamical,  since 
they  vary  with  the  quality  and  quantity  of  reports  received  and  pro¬ 
cessed  over  time.  Missing  from  this  analysis  so  far  has  been  an  assess¬ 
ment  of  the  systemic  effects  of  the  network  architecture,  effects  that 
are  more  static.  In  the  next  chapter,  we  take  up  such  measures  of 
network  complexity  and  combine  them  with  the  collaborative  effects 
to  arrive  at  a  single  measure  of  network  performance  and  its  effect  on 
decisionmaking. 
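The combined accuracy and completeness metric developed in this chapter can be sketched numerically. The code assumes a completeness measure of the form $(n/C)^{\xi}$, a worst case of $C^{-\xi}$ at $n = 1$, and the normal-case knowledge form $1 - \text{MSE}/\text{MSE}_{\max}$ with the completeness adjustments applied to both terms; the function name and all values are hypothetical.

```python
def combined_knowledge(bias, variance, bias_max, variance_max, n, c, xi):
    """Knowledge from accuracy and completeness for the univariate normal case."""
    if c < 1 or not 1 <= n <= c:
        raise ValueError("need c >= 1 and 1 <= n <= c")
    completeness = (n / c) ** xi
    # Current MSE inflated by incompleteness.
    adjusted_mse = (bias ** 2 + variance) / completeness
    # Maximum MSE inflated by the worst-case completeness, C ** -xi.
    adjusted_max = (c ** xi) * (bias_max ** 2 + variance_max)
    return 1.0 - adjusted_mse / adjusted_max


# Hypothetical values: MSE = 1 + 3 = 4, maximum MSE = 4 + 16 = 20, four elements required.
k_one = combined_knowledge(1.0, 3.0, 2.0, 16.0, n=1, c=4, xi=1.0)
k_two = combined_knowledge(1.0, 3.0, 2.0, 16.0, n=2, c=4, xi=1.0)
k_all = combined_knowledge(1.0, 3.0, 2.0, 16.0, n=4, c=4, xi=1.0)
# With a single required element, completeness has no effect.
k_single = combined_knowledge(1.0, 3.0, 2.0, 16.0, n=1, c=1, xi=1.0)
```

Under these assumptions the two adjustments collapse to a factor of $n^{\xi}$ in the denominator, so knowledge rises as more of the required elements are reported.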


CHAPTER  FIVE 


The  Effects  of  Complexity 


In  the  previous  chapter,  we  were  concerned  about  measuring  the 
effects  of  collaborative  decisionmaking  among  the  decision  nodes 
within  a  cluster.  Although  the  ability  to  collect,  process,  and  share 
information  is  dependent  on  the  structure  of  the  supporting  network, 
we  focused  our  assessment  on  the  dynamics  of  operations:  the  effects 
of  processed  and  shared  information  over  time.  In  this  chapter,  we 
focus  on  the  network  itself  and  its  ability  to  enable  efficient  and  effec¬ 
tive  information  flow.  Our  measure  is  complexity,  and  we  examine 
both  the  detrimental  effects  of  overly  complex  networks  and  the  salu¬ 
tary  effects  of  complexity.1 


Complex  Networks 

All  networks  are  complex  to  a  greater  or  lesser  degree,  including  mili¬ 
tary  command  and  control  systems  operating  in  a  network-centric 
environment.  The  challenge  is  to  understand  the  nature  of  complex¬ 
ity,  what  its  effects  are,  and  how  to  quantify  them.  All  three  tasks 
have  been  attacked  in  the  past  (we  briefly  summarise  a  history  below); 
however, a satisfactory resolution is still elusive. One thing is
certain, though: There are both good and bad effects of complexity.
For this reason, we have adopted Murray Gell-Mann’s more neutral term
plecticity to describe the effects of the network infrastructure on
military operations. This characterisation avoids the negative aspects
of the term ‘complexity’.2


1 Much of the discussion on complexity in this chapter is taken from
an unpublished RAND report: W. Perry, F. Bowden, J. Bracken, R.
Button, J. McEver, and T. Sullivan, Advanced Metrics for
Network-Centric Naval Operations, December 2002. (J. McEver
contributed the work on complex systems in the referenced report.)

What  Is  Complexity? 

Complex  networks  (such  as  the  World  Wide  Web,  which  operates  on 
another  complex  network,  the  Internet),  have  been  studied  for  years 
in  attempts  to  understand  their  structure  and  properties.  The  science 
of  complex  adaptive  systems,  too,  has  evolved  in  less  than  two  dec¬ 
ades  as  an  interdisciplinary  attempt  to  understand  how  components, 
when  tied  together  in  certain  ways,  yield  systems  with  capabilities  dif¬ 
ferent  from  those  of  their  constituent  components  taken  separately.3 
Still,  although  general  agreement  exists  on  what,  broadly,  complexity 
is,  there  are  no  agreed-on  definitions  of  complexity,  much  less  quanti¬ 
tative  measures  of  complexity  in  networks. 

2 Gell-Mann (1995/1996) argues that the study of complex adaptive systems is better
referred to as plectics, because it is ‘a broad, transdisciplinary subject covering many aspects
of simplicity and complexity as well as the properties of complex adaptive systems, including
composite complex adaptive systems consisting of many adaptive agents’. Gell-Mann derives
the word ‘plectics’ from the Greek word plektos, which can refer to both simplicity and
complexity. Invocation of the word plectics allows for the study of entanglement or the lack
thereof.

3 See, for example, Moffat (2003).

For decades, researchers have recognised that as systems grow
and become more complicated, their behaviour departs substantially
from that of the system’s components (Anderson, 1972). In 1965,
Kolmogorov proposed a useful definition of complexity: ‘The complexity
of an object is the shortest binary computer program that
describes the object’ (Kolmogorov, 1965). It can be shown that,
defined in this way, complexity is approximately equivalent to Shannon
entropy, a well-defined mathematical construct described earlier
(Shannon, 1948). Shannon entropy, as a surrogate for complexity, is
used in medical research to assess the complexity of biological systems.
Other definitions, similar in spirit to the Kolmogorov complexity,
have been proposed, including Gell-Mann’s effective complexity,
defined as the length of the description of the regularities, or the
‘grammar’, of a system, and Bennett’s logical depth, which defines
complexity as the processing time theoretically required for a computer
to go from the description of a system to the ability to duplicate
the system itself (Gell-Mann, 1995).
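
As a rough illustration of Shannon entropy used as a surrogate for complexity, the sketch below computes the empirical entropy of a symbol sequence. It is a minimal example under stated assumptions: the function name `shannon_entropy` and the sample strings are hypothetical, and the estimate uses symbol frequencies only, so it ignores ordering and is not the measure applied in the studies cited above.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Empirical Shannon entropy of a sequence, in bits per symbol.

    Assumption: probabilities are estimated from symbol frequencies alone,
    so any structure in the ordering of the symbols is ignored.
    """
    total = len(symbols)
    if total == 0:
        return 0.0
    counts = Counter(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A constant sequence carries no information; more varied sequences score higher.
constant = shannon_entropy("AAAA")
two_symbols = shannon_entropy("ABAB")
four_symbols = shannon_entropy("ABCD")
```

Because the estimate depends only on frequencies, a perfectly regular sequence such as "ABAB" scores the same as a random sequence with the same symbol counts, which is one reason the text treats entropy as a surrogate rather than a definition.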

In addition to these attempts at defining complexity, some
quantitative definitions of complexity aimed at calculating the
complexity of specific systems have been proposed. Again, no consensus
definitions have emerged from the literature, which has the flavour of
a spirited debate among many camps. Wolpert and Macready (1997)
propose a quantification of how the spatio-temporal patterns of
different scales of a system differ from one another (‘self-dissimilarity’)
as a signature of system complexity. Sporns and Tononi (2002)
describe a method they and Edelman developed to measure the
complexity of the brain by relating functional segregation and integration
measures. Sole and Luque (2002) discuss and refine a proposed
stochastic-based complexity measure of nonlinear physical systems,
based on the system entropy, the number of states to which the system
has access, and a measure of the interaction between the components
of the system. Other quantifications of complexity exist as
well, and ultimately we too present a complexity metric in this work,
specifically for a decision network, such as that proposed above, that
can be applied to evaluate alternative network clustering structures.

Even though this literature has yielded useful insight into the
problem of defining network complexity and understanding what
features of a complex network result in the effectiveness and
adaptability properties we desire, direct application of these complexity
definitions has proven difficult. This is particularly true in the context
of a command and control network in which the network components
themselves are complex and adaptive and, specifically, do not
have simple rules for how they interact with their network neighbours
(as is assumed in models of complex physical systems). Instead, by
combining insight from physics, medicine, and neural network
approaches to complexity measurement with an understanding of
network topology and desired decision network features, we can
move towards defining a metric for evaluating various network
clustering possibilities.

Plecticity 

In this context, plecticity refers to the ability of a connected set of
actors to act synergistically via the connectivity between them. This
measure is, in effect, the value added to the capability of the system
by the entanglements between the system’s nodes (decision nodes in
this work). It is intended to take into account the fact that there may
be constraints on how nodes can constructively connect to other
nodes, because of either technical or procedural limitations. That is, a
node’s connectivity can add costs as well as benefits to network
performance. Thus, networks can gain value both from the entanglements
that are present and from those that are not. A measure of
plecticity should account for the value of the nodes’ ability to glean
information from throughout the network to fulfil their particular
functions, include a means for measuring the value of network
redundancy, and reflect a cost to network effectiveness if nodes are
overwhelmed.

Command and control networks that do well with regard to
these attributes should more readily enable the acquisition of timely
information and help a decisionmaker make more effective use of
information resources gleaned from the network for the performance
of mission functions.


Accessing  Information 

A decision network must provide the decisionmakers in a cluster the
ability to gain easy access to information needed to support
decisionmaking. Whether the information is ‘pushed’, as from sensors
and sources, or ‘pulled’, as with queries over the Internet, the degree
to which the information is accessible is an important measure of a
network’s effectiveness. Because accessibility is closely related to the
completeness of information, we begin the development of the
accessibility metric from the completeness metric developed in the
previous chapter.

The metric developed earlier for the completeness of the
information set shared across the cluster is simply a ratio of counts:
[available required information elements] to [total required information
elements]. Therefore, no attempt was made to assess the degree to
which we can really expect to receive the information element, i.e., the
degree to which the network allows the cluster to access information
in the network, or information accessibility. A metric that does this is
the ratio of [the aggregate expected degree of critical information
access] to [the total number of information elements across the
network]. Such a metric accounts for the uncertainties associated with
retrieving needed information. For our CEC cluster example, in
maintaining an enemy missile track, the ‘distance’ required information
must travel from source to destination might be used to assess
the strength of the connectivity between the source and the destination
for a given information element.4 If we calculate the connectivity,
$k_i$, for information element $a_i$ in such a way that $0 \le k_i \le 1$,
we arrive at a connectivity value $k \le n$, with the equality holding only
when the distance is negligible and the connectivity is robust. As before,
$n$ is the number of critical information elements for which at least one
report has been made available. In this case, $k = f(k_i)$.5 Although not
technically a probability, connectivity calculated in this way does
reflect the uncertainties associated with moving information through
a network.

4 Distance in this context refers not only to the physical separation between source and
destination, but may also include other factors such as the time required to move information.

5 It is understood that the information element is critical to node i at time t. However, for
ease of exposition, we omit these two subscripts.

Another way to look at it is in terms of transmission costs.
Replacing the binary accounting for information elements, as was done
in the completeness score, with a connectivity score based on a distance
function of this sort recognises the cost imposed by the path
the information must take through the network to arrive at the cluster
requiring it. That is, if, for a given network configuration, a specified
type of information follows an ‘expensive’ path in getting from
its source to the cluster requiring it, that source’s contribution to the
supply of information to the cluster takes a value lower than one that
is less expensive. Consequently, the accessibility is diminished.

Distance  and  Connectivity 

The distance function can be something as simple as the number of
links in the path from source to sink. A more complicated function
might take into account the individual capabilities of each link and
node in the path. Because both nodes and links comprise a path’s
length, the more realistic assessment considers both. For now, we
defer the mathematical construct of the distance function and focus
on its use in constructing a connectivity metric. For any cluster
information element, $a_i$, we are interested in the shortest path from
source node to destination node, $d_i \ge 1$, however calculated.6 The
quantity $d_i$ represents the expense incurred by moving information
element $a_i$ from source to destination. This value is now used to
calculate the connectivity value, $k_i$, for information element $a_i$ as
follows:

$$k_i = \frac{1}{d_i^{\omega_i}},$$

where $\omega_i \ge 1$ controls the rate at which $k_i$ falls as the distance
function grows, thereby reflecting the importance of the distance $d_i$. To
adequately determine a suitable value for $\omega_i$, it is necessary to assess
the relative importance of obtaining reports on information element
$a_i$. A costless direct connection between two nodes corresponds to a
distance cost score of $d_i = 1$ and therefore yields the strongest
connectivity score, $k_i = 1$. As the distance cost increases, the
connectivity value approaches 0. If no path exists between any source of
information element $a_i$ and its destination, then $d_i \to \infty$ and $k_i = 0$.
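
The behaviour just described can be sketched in a few lines. This is an illustrative sketch, not the report's implementation: it assumes the inverse-power form $k_i = d_i^{-\omega_i}$, which satisfies the stated boundary conditions ($k_i = 1$ at $d_i = 1$, $k_i \to 0$ as $d_i \to \infty$) and reproduces the value $k_i = 0.5$ for $d_i = 2$, $\omega_i = 1$ used in Figure 5.1. The function name `connectivity` is hypothetical.

```python
import math


def connectivity(distance, omega=1.0):
    """Connectivity score for one information element.

    Assumed form: k = distance ** (-omega), with distance >= 1 and omega >= 1.
    An unreachable source is represented by an infinite distance and scores 0.
    """
    if distance < 1:
        raise ValueError("distance must be at least 1")
    if omega < 1:
        raise ValueError("omega must be at least 1")
    if math.isinf(distance):
        return 0.0
    return distance ** (-omega)


direct = connectivity(1)            # costless direct connection
two_links = connectivity(2)         # the value used in Figure 5.1
steeper = connectivity(2, omega=2)  # a larger omega penalises distance more
```

A larger $\omega_i$ makes the score fall off faster with distance, which is how the relative importance of timely reports on a given element would enter the calculation.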


6 The restriction that the path distances are never less than 1.0 accounts for the fact that, for
connectivity to exist at all, at least one link must exist between source and destination. The case
in which no links exist implies an infinitely long path resulting in 0 connectivity.

The strength of the connectivity among all the nodes in a path
must take into account the vulnerability of path elements (links and
nodes) to attack or failure. We can account for this using the
connectivity score described above by examining its value as we remove
each node, link, or both, one at a time from a given path (which we
assume here is the shortest path and has $r_i$ nodes). For simplicity, we
consider only the loss of nodes along the path.7 We define the value
${}^{j}k_i$ as the connectivity value for information element $a_i$ with the
$j$th path node removed. We create a depletion vector, $L_i$, whose
elements are measures of how much connectivity is lost by the
removal of each node, or $L_i = [l_{i1}, l_{i2}, \ldots, l_{ir_i}]^T$, where
$l_{ij} = k_i - {}^{j}k_i$ and $r_i$ (as already noted) is the number of nodes in
the shortest path that delivers information element $a_i$ from any source
to its destination.
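
The construction of the depletion vector can be sketched with a breadth-first search. This is a hedged illustration under several assumptions: distance is counted in links, connectivity takes the inverse-power form $k_i = d_i^{-\omega_i}$, the destination itself is never removed, and the four-node topology `ADJ` is hypothetical (it is chosen so that the result matches the values quoted for case 1 of Figure 5.1, not copied from the figure). All function names are illustrative.

```python
from collections import deque

INF = float("inf")


def shortest_path(adj, sources, dest, removed=frozenset()):
    """Breadth-first search from any source to dest, skipping removed nodes.

    Returns (number_of_links, path) or (INF, None) if dest is unreachable.
    """
    parent = {s: None for s in sources if s not in removed}
    queue = deque(parent)
    while queue:
        node = queue.popleft()
        if node == dest:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return len(path) - 1, path[::-1]
        for nxt in adj.get(node, ()):
            if nxt not in removed and nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return INF, None


def connectivity(links, omega=1.0):
    """Assumed form k = links ** (-omega); 0 when no path exists."""
    return 0.0 if links == INF else links ** (-omega)


def depletion_vector(adj, sources, dest, omega=1.0):
    """Return (k, L): base connectivity and the loss from removing each path node."""
    links, path = shortest_path(adj, sources, dest)
    if path is None:
        return 0.0, []
    k = connectivity(links, omega)
    losses = []
    for node in path[:-1]:  # every node on the shortest path except the destination
        reduced, _ = shortest_path(adj, sources, dest, removed={node})
        losses.append(k - connectivity(reduced, omega))
    return k, losses


# Hypothetical topology: source 1 reaches destination 9 via node 4 or node 5.
ADJ = {1: [4, 5], 4: [1, 9], 5: [1, 9], 9: [4, 5]}
k, losses = depletion_vector(ADJ, sources=[1], dest=9)
```

Removing the only source costs the full connectivity score, whereas removing node 4 costs nothing because an equally short path through node 5 remains.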

7 This approach, however, is equally valid if applied to links or both nodes and links.

The vector $L_i$ represents the vulnerability of the shortest path
and as such expresses the degree of uncertainty associated with
retrieving information element $a_i$ from network sources. The next step
is to reduce the vector $L_i$ to a scalar that can be used to reduce $k_i$,
that is, to reflect the path uncertainty in terms of its connectivity
value. A logical choice is the vector norm defined as

$$\|L_i\| = \sqrt{\sum_{j=1}^{r_i} l_{ij}^2}.$$

The vector norm measures the magnitude of the vector and
therefore in this sense measures the magnitude of the potential
depletions based on the shortest path. A large norm indicates that the
depletion associated with removal of nodes from the shortest path is
considerable. This means the connectivity associated with the shortest
path is tenuous and should therefore be reduced accordingly.
Conversely, if the norm is small, it reflects the fact that the shortest path
is fairly robust and the reduction in the connectivity score should be
minimal. Taking this into consideration, the adjusted connectivity for
information element $a_i$ from network sources to a single destination
is calculated to be

$$k_i^{*} = k_i \left(1 - \frac{\|L_i\|}{|L_i|}\right)^{1/\rho},$$

where $|L_i|$ is the cardinality of the vector $L_i$ and $0 < \rho \le 1$ is the edge
expansion parameter of the network that reflects the reliability of the
network (Davidoff, Sarnak, and Valette, 2003). The edge expansion
parameter is a generalisation of the clustering coefficient of a
network, moving from considering single nodes to clusters of nodes. For
example, Watts uses the clustering coefficient as part of the
characterisation of small world networks (Watts, 1999).
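
The adjustment can be written as a short function. This is a sketch under an explicit assumption: the exponent is taken to be $1/\rho$, inferred from the worked values in Figure 5.1, where $\rho = 0.5$ and the bracketed term is squared. The function name `adjusted_connectivity` is hypothetical, and the Euclidean norm is used for $\|L_i\|$.

```python
import math


def adjusted_connectivity(k, depletion, rho):
    """Reduce connectivity k by the vulnerability of its shortest path.

    Assumed form: k * (1 - ||L|| / |L|) ** (1 / rho), where ||L|| is the
    Euclidean norm of the depletion vector and |L| is its length.
    """
    if not 0 < rho <= 1:
        raise ValueError("rho must lie in (0, 1]")
    if not depletion:
        return k  # no path nodes to lose, so nothing to adjust
    norm = math.sqrt(sum(loss * loss for loss in depletion))
    return k * (1 - norm / len(depletion)) ** (1 / rho)


# Depletion vectors quoted in Figure 5.1 (0.17 is treated as 1/6 here).
case_1 = adjusted_connectivity(0.5, [0.5, 0.0], rho=0.5)
case_2 = adjusted_connectivity(0.5, [0.5, 0.25], rho=0.5)
case_3 = adjusted_connectivity(0.5, [0.5, 1 / 6], rho=0.5)
```

With $\rho = 1$ (a complete network) the exponent is 1 and the penalty is mildest; as $\rho$ falls, the same depletion vector reduces the score more sharply.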

The most reliable network is one in which every node is directly
connected to every other node. Such a network is called ‘complete’
and leads to a value of $\rho = 1$. A value of $\rho$ near to 1 thus implies that
there are redundant paths in the network and, hence, high reliability.
The edge expansion parameter $\rho$ is calculated by considering clusters
of nodes and how well they are connected to the rest of the network.
Formally, for a finite network $V$, consider a subset $U$ of $V$ and let $|U|$
and $|V|$ represent the number of nodes in $U$ and $V$, respectively. Let
$E \subseteq V \times V$ be the edge set of $V$. For a given node $v$ in $V$, define the
neighbours of $v$ as $\Gamma(v) = \{u \in V \mid (v,u) \in E\}$. For the cluster $U$, we can
then define the neighbours of $U$ as $\Gamma(U) = \bigcup_{v \in U} \Gamma(v)$. The boundary
of the cluster $U$ is defined as the neighbours of the cluster $U$ less those
nodes actually in the cluster $U$, i.e., $\partial U = \Gamma(U) - U$. Finally, the edge
expansion parameter $\rho$ is calculated by looking at the ratio of the size
of the boundary of a cluster to the size of a cluster, considering all
clusters within the network. In fact, we need only to consider clusters
up to half the size of the total network to do this; thus,

$$\rho = \min\left\{ \frac{|\partial U|}{|U|} : U \subseteq V;\ 0 < |U| \le \frac{|V|}{2} \right\}.$$
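
For small networks the definition can be evaluated by brute force. The sketch below enumerates every cluster of up to half the network and takes the smallest boundary-to-size ratio. It is illustrative only: the function name `edge_expansion` and the two test networks are hypothetical, the adjacency lists are assumed to be symmetric, and the enumeration is exponential in the number of nodes, so it is not a practical method for large networks.

```python
from itertools import combinations


def edge_expansion(adj):
    """Smallest |boundary(U)| / |U| over all clusters U with 0 < |U| <= |V| / 2.

    adj maps each node to an iterable of its neighbours. Returns infinity for
    a network too small to contain any admissible cluster.
    """
    nodes = list(adj)
    best = float("inf")
    for size in range(1, len(nodes) // 2 + 1):
        for cluster in combinations(nodes, size):
            members = set(cluster)
            neighbours = set()
            for v in members:
                neighbours.update(adj[v])
            boundary = neighbours - members  # neighbours of U that lie outside U
            best = min(best, len(boundary) / size)
    return best


# A complete four-node network and a four-node chain, both hypothetical.
complete = {i: [j for j in range(4) if j != i] for i in range(4)}
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
rho_complete = edge_expansion(complete)
rho_chain = edge_expansion(chain)
```

The complete network attains $\rho = 1$, as the text states, while the chain scores lower because an end pair of nodes touches the rest of the network through a single neighbour.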


Figure 5.1 illustrates three simple cases, which are fragments of a
larger network, for which the edge expansion parameter is $\rho = 0.5$.

Figure 5.1

Three Simple Connectivity Assessments

Case 1: $d_i = 2$; $k_i = 0.5$; $L_i = [0.5, 0]$; $\|L_i\| = 0.5$;
$k_i^{*} = 0.5(1 - 0.5/2)^2 = 0.281$

Case 2: $d_i = 2$; $k_i = 0.5$; $L_i = [0.5, 0.25]$; $\|L_i\| = 0.559$;
$k_i^{*} = 0.5(1 - 0.559/2)^2 = 0.259$

Case 3: $d_i = 2$; $k_i = 0.5$; $L_i = [0.5, 0.17]$; $\|L_i\| = 0.527$;
$k_i^{*} = 0.5(1 - 0.527/2)^2 = 0.271$

We assume that the distance function for information element
$a_i$ from a single source is measured as the number of nodes between
the source and the destination. In addition, we set the decay factor as
$\omega_i = 1$.

In all three cases, the initial connectivity score is 0.5. In case 1,
removing the source (node 1) results in a total loss of connectivity
reflected in the first entry in $L_i$. Removing node 4 results in no loss
of connectivity because there exists an alternative path, not including
node 4, of the same length. This is reflected in the second entry in
$L_i$. In the second case, removing node 1 has the same effect as in case
1, but removing node 6 has the effect of reducing connectivity by
0.25. The entries in the vector $L_i$ reflect results from the removal of
both nodes in turn. In the last case, removing node 6 results only in a
loss of 0.17 because of the existence of a shorter alternative path.

The examples in Figure 5.1 all have a single source for the
information element $a_i$. A more realistic example would be one in
which there are several sources for the same information element.
Figure 5.2 examines two networks with three source nodes.

Figure 5.2

Connectivity Assessments with More Than One Source Node

In case 1, the shortest path is from 3 to 8 to 9. If node 3 is
eliminated, the shortest path has four links. The same thing happens
if node 8 is removed, which results in a depletion vector that reflects a
loss of half the connectivity score for both nodes. The effective
connectivity drops from 0.5 to 0.338. In case 2, the addition of the link
between nodes 4 and 9 provides an alternative path that is as short as
the original path. This means that there is no loss in connectivity.
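
The case 1 figure can be checked by hand. The arithmetic below is a sketch that assumes the same settings as Figure 5.1, namely $\omega_i = 1$ and an exponent of $1/\rho = 2$ (that is, $\rho = 0.5$); neither value is stated explicitly for Figure 5.2.

```latex
\begin{aligned}
k_i &= \tfrac{1}{2} = 0.5, \qquad {}^{1}k_i = {}^{2}k_i = \tfrac{1}{4} = 0.25
  \quad\text{(four links after removing node 3 or node 8)},\\
L_i &= [\,0.5 - 0.25,\; 0.5 - 0.25\,] = [\,0.25,\; 0.25\,],\\
\|L_i\| &= \sqrt{0.25^2 + 0.25^2} = \sqrt{0.125} \approx 0.354,\\
k_i^{*} &= 0.5\left(1 - \frac{0.354}{2}\right)^{2} \approx 0.5 \times 0.678 \approx 0.339 .
\end{aligned}
```

Under these assumptions the result agrees with the 0.338 quoted in the text to within rounding. In case 2 both entries of $L_i$ are zero, so $\|L_i\| = 0$ and $k_i^{*} = k_i = 0.5$.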

Accounting for the quality of information through the accessibility
metric, $X(k)$, entails replacing the binary count of the number of
required information elements available in completeness with a
connectivity score for each of the cluster critical information elements, or

$$X(k) = \begin{cases} \dfrac{k}{C} & \text{if } C \neq 0 \\[1ex] 1 & \text{otherwise} \end{cases}$$

where $k = \sum_{i=1}^{n} k_i$ and $C$ is, as before, the total number of information
elements needed to render a decision within the cluster.8
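
A minimal sketch of the accessibility calculation follows, set beside the simpler completeness ratio it replaces. The function names and the sample scores are hypothetical; the connectivity scores are assumed to have been computed already, one per critical element for which at least one report is available.

```python
def completeness(available_count, required_count):
    """Binary count ratio n / C, taken as 1 when nothing is required."""
    if required_count == 0:
        return 1.0
    return available_count / required_count


def accessibility(connectivities, required_count):
    """Accessibility X(k) = k / C, where k is the sum of the connectivity scores.

    connectivities holds one score in [0, 1] per available critical element;
    required_count is C, the number of elements the cluster needs.
    """
    if required_count == 0:
        return 1.0
    return sum(connectivities) / required_count


# Hypothetical cluster: four critical elements, two of them reachable.
scores = [1.0, 0.5]
c_score = completeness(len(scores), 4)
x_score = accessibility(scores, 4)
```

Because each connectivity score is at most 1, accessibility never exceeds completeness, and the two coincide only when every available element arrives over a costless, robust path.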


Network  Redundancy 

Network redundancy focuses on the reliability of the network, i.e.,
its ability to enable the delivery of information in the face of node
loss, system outages, inefficient operating procedures, or some
combination of all these. At the same time, a network can encourage the
excessive delivery of information, thus causing delays as a result of the
time and resources required to process it all. Consequently, network
redundancy can be both a cost and a benefit of the network
information flow.

In Figure 5.3, for our CEC cluster example, we assume that the
node in the centre of the diagram is a decision node within the
cluster, deciding an appropriate response to an incoming missile threat.
The three nodes labelled $a_1$ provide position and velocity information;
$a_2$ provides missile type information; and $a_3$ provides status
information on friendly response systems (go, no-go). The nodes
labelled $a_4$ and $a_5$ are also providing information; however, this
information is not necessary to the node’s decision to select a weapon
system to engage the enemy missile.

8 Note that this formulation assumes that all information elements are equally important
and that they are independent. We discuss dependent information elements in Chapter
Three.

Figure 5.3

Node-Centric View of Information

The command nodes receive reports on the missile’s position
and speed from three sources. Because both will change over time, we
can expect multiple reports from each. These multiple reports require
combining in some way. We reflect the uncertainty associated with
the position and speed of the missile by assuming they are random
variables with known probability distributions, as discussed earlier.
One method that allows for the sequential updating of probability
distributions is the one we have chosen: Bayesian updating.9 Whatever
method is used, the degree to which the reports contribute to
estimates close to ground truth and to narrowing the distribution
variance can be considered a benefit in terms of redundancy.
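
The way successive reports narrow the distribution variance can be illustrated with the simplest conjugate case. This is a hedged sketch, not the model used in the report: it assumes a one-dimensional Gaussian prior on a missile position, Gaussian report noise with known variance, and hypothetical report values. The function name `bayes_update_normal` is illustrative.

```python
def bayes_update_normal(prior_mean, prior_var, obs, obs_var):
    """One Bayesian update of a Gaussian estimate with a Gaussian observation.

    Assumption: the observation equals the true value plus zero-mean Gaussian
    noise of known variance obs_var.
    """
    gain = prior_var / (prior_var + obs_var)  # weight given to the new report
    post_mean = prior_mean + gain * (obs - prior_mean)
    post_var = (1.0 - gain) * prior_var       # variance can only shrink
    return post_mean, post_var


# Hypothetical prior and three sequential position reports (arbitrary units).
mean, var = 100.0, 25.0
history = [(mean, var)]
for report in (104.0, 98.0, 101.0):
    mean, var = bayes_update_normal(mean, var, report, obs_var=25.0)
    history.append((mean, var))
```

Each additional report shrinks the variance by a smaller amount than the one before, which is the diminishing benefit of redundancy that the cost terms discussed next must be weighed against.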

9 In addition to Bayesian updating, the Dempster rule of combination and moving averages
may be used to combine multiple observations. See Pearl (1987) and Shafer (1976).

However, all things being equal, the more sources of required
information and the more frequent the reporting, the longer it takes
for the decision node within the cluster to get a coherent view of the
situation. This results from the fact that it takes time to process
information that may or may not contribute to improving the quality
of the estimates; this is essentially what is referred to as ‘information
overload’. In addition, some of the sources may provide disconfirming
evidence. The value of the disconfirming evidence can be good or bad
depending on the degree to which it reflects ground truth. Nevertheless,
its presence increases uncertainty, requires time to evaluate, and
therefore may decrease the quality of the estimates. Finally, it is also
possible that raw data are processed before being sent, thus arriving at
the command node as information that is time stamped with the time
at which the processing ended. This possibility introduces an
additional latency that contributes to uncertainty.

Unneeded  Information 

Dealing with information that is not needed is treated as a pure
cost.10 In Figure 5.3, the two information elements, $a_4$ and $a_5$, serve
no useful purpose to the missile tracking and response mission.
The cost of dealing with information of this type increases with the
number of different information elements arriving at the command
node and with their redundancy.

The  Combined  Effects 

In the next section, we develop metrics for the measures just
discussed. The result will be an overall metric for network plecticity. For
networks with inadequate information flow, as with excessive
information flow, we would expect low plecticity scores. The goal is to
configure the information flow and clustering over a network with
established link connectivity so as to maximise plecticity as measured
in terms just discussed. If we assume a normalised plectic score, with
0 representing no plecticity and 1 representing maximum plecticity,
then Figure 5.4 illustrates how the costs and benefits affect this score.

10 This is not always the case. In a rapidly evolving combat situation, information not
needed at one moment can become crucial the next. In this case, it is important that the
network be capable of adapting rapidly. However, there is still some cost associated with
accepting and processing information that is not needed to prosecute the current operation.

Figure 5.4

Overall Network Plecticity

• Minimal flow. The first flow depiction in Figure 5.4 represents
minimal information flow and a set of isolated nodes. Although
depicted as having no information flow, in reality we would
expect that there are a few sources of required information.
However, there is no opportunity to share information, and we
assume that the decision nodes need not consult with each other
before acting. The result is no benefits, no costs, and therefore a
low plectic score.

• Excessive flow. Turning to the last flow depiction in Figure 5.4,
the effects of information overload resulting from too much
required and/or unneeded information result in low plectic values
as well. The high benefits associated with a rich information
flow are offset by the high costs of processing excessive
information. Information can be shared directly among all the nodes.

• Adequate flow. Finally, the centre flow configuration in Figure
5.4 depicts reasonable redundancy of required information and
limited unneeded information sources, thus resulting in optimal
plectic values. The high benefits are associated with just the
right amount of information flow, and the costs associated with
processing excessive information are therefore very low. The
connectivity is rich, allowing for direct and indirect information
sharing. The fewer channels per node result in fewer network
ties and, therefore, a more manageable network.

The  Benefits  of  Redundancy 

As mentioned earlier, redundancy has both cost and benefit aspects,
each requiring definition in metric form. Multiple reports of required
information from several sources can increase the reliability of the
estimates of information elements. At the same time, too many
reports coming into, and being shared around, a cluster incur a cost
because of information overload, reports of unneeded information,
and possible disconfirming reports. We address the benefits first, but
before beginning, we recognise the possibility that because the source
of a rendered report is extremely reliable, its benefit might be
considered equivalent to several reports from less reliable sources. This
adds a complicating factor because the reliability of the sources of all
reports must be assessed. Assuming the data are available to make this
assessment, we can provide for this phenomenon through suitable
weighting.

First, we let $r_i(\Theta_i)$ be the benefit accruing from obtaining
reports on the value of information element $a_i$ from $p_i$ sources,
where $\Theta_i = \sum_{j=1}^{p_i} \theta_{i,j}$ and $\theta_{i,j} \in [1, \infty)$ measures the assessed
reliability of the report on information element $a_i$ from source
$s_j$ ($1 \le j \le p_i$). This formulation ensures that $\Theta_i \ge 1$ as long as at
least one report is received for information element $a_i$. Also, if all
sources are minimally reliable, then $r_i(\Theta_i) = r_i(p_i)$, since $\theta_{i,j} = 1$ for
all sources $s_j$. As with the accessibility metric $X$, we restrict $r_i(\Theta_i)$
to be between 0 and 1. In this case, $r_i(\Theta_i) = 0$ implies no benefit from
redundancy. This result corresponds to the case in which a reported
estimate for information element $a_i$ emanates from a single,
marginally reliable source ($\theta_{i,j} = 1$), or in which no report is rendered.
At the other extreme, $r_i(\Theta_i) \to 1$ for some sufficiently large number of
sources. A suitable model that reflects this behaviour is

$$r_i(\Theta_i) = \begin{cases} 1 - e^{-\delta_i(\Theta_i - 1)} & \text{if } \Theta_i \ge 1 \\ 0 & \text{otherwise} \end{cases}$$

The parameter $\delta_i$ reflects the relative importance of the information
element $a_i$. If a single report from an extremely reliable source
arrives, it can be given a large weight so that $\Theta_i = \theta_{i,1}$ is large and
$r_i(\Theta_i) \to 1$ for a single report. This metric therefore not only measures
the effects of redundancy but also reflects the adequacy of the source
of the report. Figure 5.5 illustrates how the value of the constant $\delta_i$
influences how rapidly redundancy and adequacy scores contribute to
convergence.

Figure 5.5

The Effect of $\delta_i$ and $\Theta_i$ on the Benefits of Redundancy

The table below the figure records the data used to construct the
graphs. Note that for the third entry, only a single report source
($p_3 = 1$) exists, but it is considered more reliable than the two and
three sources for entries 1 and 2. However, regardless of the
redundancy scores, the impact of the information element importance
scores is dramatic.
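
The per-element benefit can be sketched directly from its stated properties. This is an illustrative sketch: it assumes the exponential form $r_i(\Theta_i) = 1 - e^{-\delta_i(\Theta_i - 1)}$, which is zero for a single marginally reliable source and approaches 1 as the summed reliability grows. The function name `redundancy_benefit` and the sample reliabilities and importance values are hypothetical, not the data behind Figure 5.5.

```python
import math


def redundancy_benefit(reliabilities, delta):
    """Benefit r_i from the reports on one information element.

    reliabilities: one assessed reliability (>= 1) per reporting source.
    delta: importance parameter for the element (assumed positive).
    Assumed form: 1 - exp(-delta * (Theta - 1)), Theta = sum of reliabilities;
    the benefit is 0 when no report is received.
    """
    if not reliabilities:
        return 0.0
    if any(theta < 1 for theta in reliabilities):
        raise ValueError("each reliability must be at least 1")
    total = sum(reliabilities)
    return 1.0 - math.exp(-delta * (total - 1.0))


single_marginal = redundancy_benefit([1.0], delta=1.0)
three_marginal = redundancy_benefit([1.0, 1.0, 1.0], delta=1.0)
single_reliable = redundancy_benefit([5.0], delta=1.0)
low_importance = redundancy_benefit([1.0, 1.0, 1.0], delta=0.1)
```

With these illustrative numbers a single highly reliable source outscores three marginal ones, and lowering $\delta_i$ slows the approach to 1 markedly.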

Having determined a redundancy benefit for each information
element in a cluster’s information set, we now combine the scores to
arrive at an aggregate score for the required information set available
across the cluster. Recall that the total number of required
information elements across the whole network is $N$; the number critical
to a cluster is $C$, where $C \le N$; and the number of required information
elements available within the cluster is $n$, where $n \le C$. If we let the
vector $\Theta = [\Theta_1, \Theta_2, \ldots, \Theta_C]^T$ represent the value of reports received
from the $P = [p_1, p_2, \ldots, p_C]$ sources, we can construct a suitable
normalised aggregate metric, $R(\Theta)$, as follows:11

$$R(\Theta) = \frac{1}{n} \sum_{i=1}^{C} \gamma_i\, r_i(\Theta_i),$$

where $\gamma_i = 1$ if $p_i \ge 1$ and 0 otherwise. No penalty is assessed for
missing information. This is accounted for in the accessibility score
discussed earlier. In the case in which $n = C = 0$, we must have that
$\Theta_i = 0$ and therefore $R(\Theta) = 0$; i.e., there is no redundancy benefit,
even though the accessibility score is $X(k) = 1$.
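
The aggregation step can be sketched as an average over the elements that have at least one report. This is illustrative only: it assumes the exponential per-element benefit $1 - e^{-\delta_i(\Theta_i - 1)}$, and the function name `aggregate_redundancy` and the sample cluster are hypothetical.

```python
import math


def aggregate_redundancy(source_reliabilities, deltas):
    """Aggregate redundancy benefit R over a cluster's critical elements.

    source_reliabilities: one list per critical element holding the assessed
    reliability (>= 1) of each reporting source; an empty list means no report.
    deltas: one importance parameter per critical element.
    Elements without reports are skipped (gamma_i = 0), and the sum is divided
    by n, the number of elements with at least one report.
    """
    n = sum(1 for sources in source_reliabilities if sources)
    if n == 0:
        return 0.0  # no reports at all, so no redundancy benefit
    total = 0.0
    for sources, delta in zip(source_reliabilities, deltas):
        if sources:  # gamma_i = 1
            total += 1.0 - math.exp(-delta * (sum(sources) - 1.0))
    return total / n


# Hypothetical cluster: two marginal sources, one marginal source, no report.
cluster = [[1.0, 1.0], [1.0], []]
r_score = aggregate_redundancy(cluster, deltas=[1.0, 1.0, 1.0])
```

The missing third element does not lower the score here; as the text notes, that shortfall is charged to accessibility instead.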


11 Implicit assumptions in this form of aggregation are that (1) the value attributed to the
reports is linear and (2) there is no value associated with the interactions among the reports.
We discuss the issue of multi-attribute aggregation in Chapter Three.

Combining the Benefits

The next step is to combine the beneficial effects of information
access, $X$, and redundancy, $R$, into a single metric for the cluster. To
do this, we choose a conditional model. The benefits of redundancy
depend on the information elements received by the cluster, in
addition to the number of sources for each. The conditioning, however, is
quite weak. For example, it is possible for a cluster to have perfect
information access and score 0 for redundancy. Conversely, a cluster
with very limited access can have a rather large redundancy score. But
it is impossible to obtain positive redundancy benefit unless there is at
least one report on at least one information element. A simple ratio,
$R(\Theta)/X(k)$, exposes the desired relationship. However, the ratio is
only bounded between 0 and 1 when $R(\Theta) \le X(k)$ and $X(k) \neq 0$. We
can modify the ratio using parameters to avoid a zero denominator
and ensure the combined metric is bounded on [0,1]. We begin by
setting

$$B[R(\Theta) \mid X(k)] = c\,\frac{\kappa + R(\Theta)}{\beta - X(k)} + d.$$

In this formulation, the parameter $\beta > 1$ ensures a nonzero
denominator and the parameter $\kappa > 0$ with the two constants $c$ and $d$
are used to ensure the combined metric is bounded between 0 and 1.
The parameters, $\beta$ and $\kappa$, reflect the relative importance placed on
redundancy and completeness. The desired boundary conditions are
$B(0 \mid 0) = 0$ and $B(1 \mid 1) = 1$. That is, obtaining the maximum
redundancy given maximum access produces a maximum combined score,
whereas it is impossible to achieve any redundancy given no access to
the critical information elements.12 The first condition yields

f?(0|0)  =  c-^  +  </  =  0  and  d  =  -c^\ 


hence,  we  get 


\[ B[R(\Theta) \mid X(k)] = c\left[\frac{\kappa + R(\Theta)}{\beta - X(k)} - \frac{\kappa}{\beta}\right]. \]


The  second  boundary  condition  yields 

\[ B[1 \mid 1] = c\left[\frac{\kappa + 1}{\beta - 1} - \frac{\kappa}{\beta}\right] = c\,\frac{\beta + \kappa}{\beta(\beta - 1)} = 1; \]


therefore, 


12 Two other ‘edge’ conditions might be considered as well: $B(1 \mid 0)$ and $B(0 \mid 1)$. The former is not possible because it is impossible to accrue any benefit from redundant reports if critical information is inaccessible. The latter is equally impossible because it suggests that no benefit from redundancy is possible even though critical information is totally accessible. If at least one source reports on each critical information element, then $R(\Theta) > 0$.

\[ c = \frac{\beta(\beta - 1)}{\beta + \kappa} \quad\text{and}\quad d = -\frac{\kappa(\beta - 1)}{\beta + \kappa}. \]


This  gets  us  the  final  relationship, 


\[ B[R(\Theta) \mid X(k)] = \frac{(\beta - 1)\,[\kappa X(k) + \beta R(\Theta)]}{(\beta + \kappa)\,[\beta - X(k)]}, \]


which  is  bounded  between  0  and  1  and  exhibits  the  required  depend¬ 
ency  between  accessibility  and  the  benefits  of  redundancy.  Substi¬ 
tuting 

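and

\[ R(\Theta) = 1 - \frac{1}{n}\sum_{i=1}^{C} \gamma_i e^{-\delta_i(\theta_i - 1)} \]

yields

\[ B[R(\Theta) \mid X(k)] = \frac{(\beta - 1)\left[\kappa X(k) + \beta\left(1 - \frac{1}{n}\sum_{i=1}^{C} \gamma_i e^{-\delta_i(\theta_i - 1)}\right)\right]}{(\beta + \kappa)\,[\beta - X(k)]}. \]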
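
As a rough numerical check of the combined benefit, the following Python sketch evaluates the closed form B = (β − 1)[κX + βR] / ((β + κ)(β − X)) and lets the two boundary conditions be confirmed directly. The function name `combined_benefit` and the default values of `beta` and `kappa` are illustrative assumptions, not values taken from the study.

```python
def combined_benefit(r, x, beta=2.0, kappa=1.0):
    """Combined benefit of redundancy r = R(Theta) given access x = X(k).

    Hypothetical helper: the default beta and kappa are illustrative only.
    """
    if beta <= 1.0 or kappa <= 0.0:
        raise ValueError("requires beta > 1 and kappa > 0")
    if not (0.0 <= r <= 1.0 and 0.0 <= x <= 1.0):
        raise ValueError("r and x must lie in [0, 1]")
    numerator = (beta - 1.0) * (kappa * x + beta * r)
    denominator = (beta + kappa) * (beta - x)  # nonzero because beta > 1 >= x
    return numerator / denominator


if __name__ == "__main__":
    # Boundary conditions B(0|0) and B(1|1), then a mid-range case.
    print(combined_benefit(0.0, 0.0))
    print(combined_benefit(1.0, 1.0))
    print(combined_benefit(0.5, 0.8))
```

Because the expression increases in both arguments, values inside the unit square stay between the two boundary values. Perfect access with no redundancy, `combined_benefit(0.0, 1.0)`, still returns a positive score, which reflects the weak conditioning described in the text.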


The  Costs  of  Information  Within  a  Cluster 

The  contribution  of  costs  to  plecticity  within  a  cluster  arises  from 
three  sources:  (1)  information  overload,  (2)  disconfirming  evidence, 
and (3) incomplete information. The last of these is included in the calculation of the benefits associated with information accessibility. Disconfirming evidence has been covered previously as well. It arises as an issue when reports from disparate sources and sensors must be combined to create a common operating picture. In this section, we
focus  exclusively  on  the  costs  of  information  overload.  As  mentioned 
earlier,  information  overload  arises  from  too  many  sources  of  needed 
information  and  any  source  of  unneeded  information,  which  are  both 
functions  of  redundancy.  We  begin  with  the  costs  of  unneeded 
information. 

Costs  of  Unneeded  Information 

In  this  analysis,  the  supply  of  unneeded  information  places  a  burden 
on  the  node  receiving  it  and  sharing  it  around  the  cluster.  It  has  an 
immediate  negative  impact  in  that  it  must  be  processed  or,  at  a 
minimum,  interferes  with  the  receipt  of  needed  information.  How¬ 
ever,  as  more  of  it  is  supplied,  its  marginal  impact  is  reduced  in  the 
same  way  email  spam  is  dealt  with  in  a  modern  office  environment. 
Thus,  a  good  function  to  model  this  behaviour  is  the  exponential 

\[ U(m) = 1 - e^{-vm}, \]

where  m  is  the  number  of  sources  of  unneeded  information  and  v  is  a 
scaling  parameter  that  reflects  the  rate  at  which  unneeded  informa¬ 
tion  contributes  to  cost.  This  calculation  then  indicates  the  effect 
across  the  whole  cluster,  rather  than  at  an  individual  affected  node.  In 
this  case,  no  distinction  is  made  between  multiple  sources  of  the  same 
unneeded  information  and  multiple  sources  of  different  information 
elements.  Thus,  the  same  cost  results  from  the  same  information  ele¬ 
ment  supplied  x  times  or  x  different  information  elements  supplied 
once  each.  We  show  the  influence  of  v  on  the  cost  in  Figure  5.6.  As 
v  increases  from  zero,  the  saturation  point  is  reached  more  rapidly. 
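
The saturating behaviour of this cost can be seen with a short Python sketch. It assumes the exponential form 1 − exp(−v·m) given above; the function name `unneeded_cost` and the sample values of `v` are hypothetical choices made only for illustration.

```python
import math


def unneeded_cost(m, v):
    """Cluster-wide cost of m sources of unneeded information (scaling v)."""
    if m < 0 or v < 0:
        raise ValueError("m and v must be non-negative")
    return 1.0 - math.exp(-v * m)


if __name__ == "__main__":
    # Larger v reaches the saturation level (cost near 1) with fewer sources.
    for v in (0.1, 0.5, 1.0):
        costs = [round(unneeded_cost(m, v), 3) for m in range(0, 11, 2)]
        print(f"v={v}: {costs}")
```

Each additional source adds less cost than the one before it, which is the diminishing marginal impact described in the text.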

Figure 5.6. Cost of Unneeded Information

Costs of Redundant but Needed Information

We now examine the effects of the cluster receiving too much needed information. As mentioned earlier, an overabundance of needed information contributes to information overload, increases the likelihood that some of the information will be disconfirming, and therefore may cause delays in processing. The costs of information overload associated with needed information are generally minimal for
low levels of redundancy. Indeed, at these levels, the benefits far outweigh the costs, as discussed earlier. However, at some point, costs rise sharply so that the marginal cost of an additional source of information is greater than that of the previous source. At some further point, this cost then levels off so that the marginal costs are minimal. This behaviour is best described using a logistic response function such as the following:

\[ \frac{e^{(\chi_i + \varphi_i \rho_i)}}{1 + e^{(\chi_i + \varphi_i \rho_i)}}. \]

In this formulation, the $\rho_i$ values are the number of sources for information element $a_i$, as before, and $\chi_i$ and $\varphi_i$ are shaping parameters. We illustrate the influence of these parameters in Figures 5.7 and 5.8. The actual values will depend on the effects of receiving extra needed information. They can be assessed based on the point at which the extra sources of needed information begin to have a detrimental effect on operations at the node, the point at which the marginal cost of redundant information increases rapidly, and how soon after this the saturation point is reached, i.e., the point at which the marginal costs become negligible.

Figure 5.7. The Costs of Redundancy for $\varphi_i = 1$

Figure 5.8. The Costs of Redundancy for $\chi_i = -6$

As  was  the  case  in  calculating  the  overall  benefit  of  redundancy, 
the  costs  of  oversupply  of  each  needed  information  type  can  be  com¬ 
bined  in  a  variety  of  ways.  For  simplicity,  we  expressed  it  here  as  a 
simple  sum.13 

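\[ G(P) = \frac{1}{n}\sum_{i=1}^{C} \gamma_i\,\frac{e^{(\chi_i + \varphi_i \rho_i)}}{1 + e^{(\chi_i + \varphi_i \rho_i)}}, \]

where $P = [\rho_1, \rho_2, \ldots, \rho_C]$ and

\[ \gamma_i = \begin{cases} 1 & \text{if } \rho_i > 1 \\ 0 & \text{otherwise.} \end{cases} \]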
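
The logistic cost of redundant needed information can be sketched numerically as follows. This is a minimal Python sketch under stated assumptions: the per-element cost is taken to be the logistic form e^z / (1 + e^z) with z = χ_i + φ_i ρ_i, only elements with more than one source contribute (the indicator γ_i), and the sum is normalised by the number of information elements. The function names and the parameter values in the example are hypothetical and are not taken from the study.

```python
import math


def element_cost(rho, chi, phi):
    """Logistic cost for one information element reported by rho sources."""
    z = chi + phi * rho
    # Numerically stable evaluation of e^z / (1 + e^z).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def redundancy_cost(rhos, chis, phis):
    """Normalised sum of element costs; the normaliser len(rhos) is assumed."""
    if not (len(rhos) == len(chis) == len(phis)) or not rhos:
        raise ValueError("need equal-length, non-empty parameter lists")
    total = 0.0
    for rho, chi, phi in zip(rhos, chis, phis):
        gamma = 1 if rho > 1 else 0  # indicator: more than one source
        total += gamma * element_cost(rho, chi, phi)
    return total / len(rhos)


if __name__ == "__main__":
    # With chi = -6 and phi = 1 the cost is small for few sources,
    # rises steeply around six sources, and then levels off.
    for rho in range(0, 13, 2):
        print(rho, round(element_cost(rho, -6.0, 1.0), 4))
```

With these illustrative parameters, the midpoint of the curve falls where χ_i + φ_i ρ_i = 0, so χ_i shifts the point at which costs begin to rise and φ_i controls how quickly saturation follows.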


Combining  the  Costs  of  Information  for  a  Cluster 

In  considering  the  overall  costs,  a  balance  is  struck  between  costs  of 
needed  and  unneeded  information.  Unfortunately,  the  two  are  not 
independent.  That  is,  the  presence  of  one  can  greatly  affect  the  cost  of 
the  other.  For  example,  dealing  with  redundant  needed  information 
in  the  absence  of  any  extraneous,  noncontributing  reports  is  clearly 
different  than  if  the  unneeded  reports  are  present.  However,  the 
nature  of  the  dependency  is  not  clear.  Consequently,  we  use  a  simple 
weighted  linear  sum  of  the  two,  or 

\[ O[U(m), G(P)] = \alpha U(m) + (1 - \alpha) G(P), \]
where $0 < \alpha < 1$ is a relative weight parameter.
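
The weighted sum can be illustrated with a few lines of Python. This is only a sketch: the function name `overload_cost` and the example values are hypothetical, and the two inputs are assumed to be the unneeded-information cost and the needed-information cost, each already scaled to [0, 1].

```python
def overload_cost(u_cost, g_cost, alpha=0.5):
    """Weighted linear sum of unneeded (u_cost) and needed (g_cost) costs."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if not (0.0 <= u_cost <= 1.0 and 0.0 <= g_cost <= 1.0):
        raise ValueError("costs must lie in [0, 1]")
    return alpha * u_cost + (1.0 - alpha) * g_cost


if __name__ == "__main__":
    # Shifting alpha towards 1 puts more weight on unneeded information.
    for alpha in (0.1, 0.5, 0.9):
        print(alpha, round(overload_cost(0.2, 0.8, alpha), 3))
```

Because the result is a convex combination, it always lies between the two input costs, so the combined cost stays in [0, 1] whenever the inputs do.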


13 The same assumptions made for the benefits of redundancy apply here as well.

Some may argue here that two functions have been used to model what is essentially the same cost: information overload. However, we consider that these two types of information overload
have  different  impacts  on  cluster  effectiveness.  Needed  information 
affects  the  amount  of  information  that  needs  to  be  processed,  but 
there  is  also  a  greater  chance  of  conflicting  information,  which  places 
an  additional  burden  on  the  cluster.  Unneeded  information  is  more 
easily  dismissed,  given  that  it  is  not  essential  to  the  user’s  needs. 


Combining  Costs  and  Benefits 

The  next  step  is  to  combine  the  costs  and  benefits  of  network  plec- 
ticity  for  a  cluster  within  the  network,  associated  with  the  mission  at 
hand.  The  term  ‘costs’  suggests  a  simple  cost-benefit  analysis  might 
be  appropriate.  In  such  a  case,  the  benefit  is  divided  by  the  cost, 
resulting  in  an  assessment  of  the  cost  for  a  unit  of  benefit.  However, 
in  this  analysis,  we  are  not  dealing  with  a  true  cost  in  the  cost-benefit 
sense,  but  rather  a  cost  more  closely  described  as  a  penalty.  We  began 
this  chapter  by  describing  the  characteristics  of  the  network-plecticity 
metric,  as  illustrated  in  Figure  5.4.  We  assume  each  of  the  clusters  in 
the  network  is  logically  connected  to  support  a  given  mission.  Plec- 
ticity  for  a  cluster  is  then  associated  with  the  flow  of  information 
associated  with  that  cluster.  Both  minimal  (inadequate)  flow  and 
excessive flow should result in low plecticity, whereas ‘optimal’ (adequate) flow should result in high plecticity. Therefore, for each cluster W in the network, the measure of network plecticity $C(B,O)$ is calculated as follows:

\[ C(B,O) = B[R(\Theta) \mid X(k)]\,\bigl[1 - O[U(m), G(P)]\bigr] = \frac{(\beta - 1)\,[\beta R(\Theta) + \kappa X(k)]\,[1 - \alpha U(m) - (1 - \alpha) G(P)]}{(\beta + \kappa)\,(\beta - X(k))}. \]

Overall Network Performance

The  last  step  is  to  combine  this  redundancy-based  plecticity  with  the 
benefits  of  collaboration  to  produce  a  metric  that  will  assess  the  per¬ 
formance  of  networked  decisionmaking  headquarters.  Collaboration 
measures  the  effects  of  information  sharing  across  a  cluster  on  infor¬ 
mation  completeness  and  accuracy  (i.e.,  bias  and  precision),  whereas 
redundancy-based  plecticity  measures  the  effects  of  redundant  infor¬ 
mation  and  the  degree  of  information  access.  The  former  assesses  the 
dynamic  nature  of  the  operation  conducted  on  the  network;  the  latter 
measures  the  effects  of  the  underlying  network  structure  and  is  there¬ 
fore  systemic.  All  the  dependencies  among  the  several  components  of 
collaboration  and  plecticity  are  not  generally  well  understood.  How¬ 
ever,  we  know  that  high-quality  performance  requires  good  cluster 
knowledge  and  the  means  to  share  it  and  that  scores  in  either  category 
are  penalised  by  deficiencies  in  the  other.  Therefore,  the  measure  of 
total  network  performance  is  taken  to  be 

Q(n,Kto)=if=1[c,(s,o)^.Kf , 


where  Xti  =  1  and  ^  is  the  total  number  of  clusters  across  the  net¬ 
work. 

For  values  of  Qin.K^)  close  to  1.0,  the  network  is  performing 
well  by  producing  the  information  required  to  take  decisions  within 
each  of  the  clusters  when  required.  However,  this  is  not  the  whole 
story.  The  next  step  is  to  assess  how  well  the  combat  mission  is 
accomplished. As important as good decisions are, good combat outcomes are the ultimate measure of the value of network-centric operations.


Summing  Up 

In  assessing  the  effects  of  networking  headquarters  on  increasing  deci¬ 
sionmakers’  knowledge  and  therefore  improving  decisions,  we  have 
analysed  the  network  in  terms  of  its  static  structure  and  of  the 
dynamics  associated  with  performing  the  operational  mission.  The 
former  resulted  in  the  development  of  a  structural  ‘plecticity’  metric 
for  each  cluster,  and  the  latter  in  a  dynamic  ‘knowledge’  metric  for 
each  cluster.  Both  these  metrics  were  developed  by  viewing  a  network 
of  connected  headquarters  as  a  set  of  clusters  within  each  of  which  all 
decision  nodes  (headquarters)  share  information.  They  are  then  com¬ 
bined  to  form  a  metric  of  overall  network  performance. 

In  the  process  of  developing  these  metrics,  we  have  appealed  to 
information  sciences,  probability  and  statistics,  estimation  theory, 
complexity  theory,  combinatorics,  and,  of  course,  a  large  measure  of 
heuristics.  In  the  process,  several  terms  were  introduced  as  shaping 
parameters.  For  the  most  part,  these  parameters  are  designed  to  reflect 
the  behaviour  of  both  physical  and  cognitive  phenomena.  Where  pos¬ 
sible,  we  suggest  methods  for  assessing  reasonable  values  for  these 
parameters.  Nevertheless,  we  recognise  that  establishing  methods  for 
assessing  these  values  is  an  open  research  question  that  will  require 
considerable  experimentation. 

The  aim  of  the  work  presented  in  this  chapter  is  to  contribute  to 
the  development  of  a  theory  of  such  complex  information  networks 
in  order  to  stimulate  both  further  theoretical  development  and 
experimentation.  Although  we  include  an  application  of  the  measures 
and  metrics  in  Appendix  C,  there  is  still  much  more  work  to  be  done 
in  progressing  this  new  science. 


CHAPTER  SIX 

Conclusion 


At  the  outset,  we  argued  that  it  is  important  that  military  planners 
responsibly  test  the  emerging  network-centric  concepts  before  their 
adoption.  Several  observers  concerned  about  the  ‘irrational  exuber¬ 
ance’  surrounding  the  claimed  benefits  of  network-centric  operations 
support  this  view  as  well.1  They  argue,  as  we  do,  that  the  claimed 
benefits  may  prove  to  be  true  but  that  analysts  should  strive  to  assist 
the  military  community  in  assessing  them.  This  recommendation 
implies  employing  the  full  range  of  analytic  techniques:  models, 
simulations,  exercises,  and  experiments.  The  problem,  however,  is  the 
paucity  of  tools  that  will  allow  us  to  quantify  the  benefits  of  local 
collaboration  and  clustering  across  an  information  network.  Although 
we  make  no  claim  that  the  methods  reported  here  are  definitive,  they 
do  represent  an  approach  that  draws  on  several  disciplines  to  assess 
how  well  alternative  operating  procedures  and  network  configurations 
contribute  to  the  decisions  made  by  headquarters  that  share  informa¬ 
tion  and  thus  develop  shared  awareness  and  collaboration. 

The  approach  taken  brings  together  two  key  ideas.  The  first  idea 
comes  from  previous  work  by  RAND  that  shows  how  Shannon 
entropy  can  be  used  as  the  basis  of  a  quantified  measure  of  the 
knowledge  resident  within  a  cluster  of  decisionmakers  who  share 
information.  Such  an  approach  allows  the  concept  of  full  shared 
awareness  to  be  precisely  defined  in  terms  of  such  clusters  and  also 


1 See, for example, Giffen and Reid (2003) and Barnett (1999).

permits the measure of benefit to be lifted from the information
domain  to  the  cognitive  domain  in  terms  of  our  process  model  of 
information  age  warfare.  The  second  idea  comes  from  Dstl  research 
on  the  representation  of  command  and  control  (and  the  other  associ¬ 
ated  elements  of  C4ISR)  in  aggregate  constructive  simulation  models 
of  conflict.  This  concept  has  resulted  in  the  Rapid  Planning  Process, 
which  gives  a  basis  in  terms  of  mathematical  algorithms  for  the  repre¬ 
sentation  of  expert  decisionmaking  in  fast-paced,  fluid  circumstances. 
These  ideas  are  brought  together  using  the  idea  of  ‘plecticity’  drawn 
from  our  view  of  the  network  as  a  complex  system.  Combining  col¬ 
laboration  and  plecticity  results  in  a  total  measure  of  the  benefits  and 
costs  associated  with  a  particular  local  collaboration  and  clustering 
across  such  a  network.  The  measure  captures  the  ability  of  the  clusters 
to  support  the  decisionmaking  process  at  a  key  decision  point,  in 
terms  of  determining  to  what  extent  the  distributed  decisionmakers 
are  within  their  ‘comfort  zones’  in  relation  to  the  values  of  their  key 
decision  elements,  which  are  shared  across  the  clusters. 

We  have  adopted  an  approach  that  first  deals  with  the  concep¬ 
tually  simplest  case,  when  the  information  elements  forming  the  basis 
of  the  decisionmaking  in  a  cluster  of  the  network  all  have  the  same 
distribution  of  uncertainty  (hence,  we  assume  they  are  all  normally 
distributed).  In  this  case,  with  full  shared  awareness  across  the  cluster, 
the  knowledge  available  to  the  cluster  can  be  quantified  on  the  basis 
of  the  variance  of  the  key  decision  elements  and  their  covariance, 
which  builds  up  over  time.  This  first  part  of  the  work  highlights  in 
particular  the  benefit  to  local  knowledge  of  such  covariance  (i.e.,  the 
degree  to  which  one  element  of  information  relates  to  another)  in 
quantifying  such  knowledge.  Such  a  measure  thus  relates  closely  to 
‘commonsense’  ideas  of  knowledge  in  terms  of  understanding  how  a 
number  of  elements  relate  one  to  another. 

We  then  deal  with  the  more  general  case  of  when  the  informa¬ 
tion  elements  shared  across  a  cluster  have  more  general  distributions 
of  uncertainty.  A  number  of  approaches  to  this  case  are  examined 
based  on  a  mixture  of  empirical  and  theoretical  ideas.  By  combining 
these ideas, it is possible to form a complete chain of quantification, from an initial network architecture and local collaboration, through the formation of clusters across the network, through to the
overall  plecticity  and  performance  of  such  a  network.  In  this  way,  dif¬ 
ferent  possibilities  for  collaboration  (and  hence  different  future  head¬ 
quarters  structures  based  on  such  distributed  clustering  and  local 
decisionmaking)  can  be  compared  in  terms  of  their  total  network  per¬ 
formance.  This  comparison  measures  the  ability  of  such  distributed 
decisionmakers  to  make  better  decisions,  based  on  better  under¬ 
standing  of  the  critical  information  elements  shared  across  collabo¬ 
rating  clusters  in  the  network. 


APPENDIX  A 


The  Rapid  Planning  Process 


Gary  Klein’s  Recognition  Primed  Decision  (RPD)  model  emphasises 
situation  awareness  (SA)  (Klein,  1989).  The  goal  of  the  SA  process  is 
to  provide  the  decisionmaker  (the  command  agent)  with  an  under¬ 
standing  of  what  is  happening  in  the  outside  world.  In  particular,  the 
command  agent,  through  SA,  tries  to  answer  the  question:  ‘Is  the 
situation  that  I  perceive  in  the  outside  world  one  that  I  recognise? 
Because  if  I  do  recognise  the  situation,  then  my  experience  (long-term 
memory)  tells  me  immediately  which  course  of  action  (CoA)  I  should 
adopt,  given  this  situation.’ 

The  focus  of  the  SA  process  is  thus  on  pattern  matching — 
analysing  the  information  available  about  the  outside  world  and  try¬ 
ing  to  match  the  perceived  state  of  the  world  to  one  of  an  existing 
array  of  patterns  held  in  the  command  agent’s  long-term  memory. 
Each pattern is a representation of a situation, and each situation is
linked  directly  to  a  CoA  appropriate  to  that  situation.  This  linkage, 
held  in  the  command  agent’s  long-term  memory,  represents  the 
command  agent’s  experience  and  is  what  enables  the  command  agent 
to  make  decisions  rapidly  without  recourse  to  extensive  option 
generation  and  evaluation. 

We  model  this  behaviour  through  the  Rapid  Planning  Process. 
The  model  thus  comprises  four  main  stages:  (1)  observation  analysis 
and  parameter  estimation,  (2)  situation  assessment,  (3)  pattern 
matching  and  preferred  posture  selection,  and  (4)  posture  transition. 
We discuss the first three in the context of a simple land operation example. The headquarters model is concerned only with the overall
process  up  to  the  decision. 


Stage  1:  Observation  Analysis  and  Parameter  Estimation 

Stage  1  involves  analysing  the  command  agent’s  current  observations 
of  the  battlespace,  which  comprise  data  received  by  the  command 
agent  via  its  sensors.  The  analysis  of  these  data  consists  of  data 
smoothing  and  parameter  (mean  and  covariance)  estimation.  Where 
the  variables  are  normally  distributed,  the  data  analysis  is  performed 
by  a  collection  of  dynamic  linear  models  (DLMs).  A  DLM  is  a 
mathematical  structure  for  short-term  forecasting,  modelling,  and 
analysis  of  time-series  processes  with  normal  errors.  (DLMs  are  fully 
described  in  West  and  Harrison,  1997.) 

A  Simple  Land  Operations  Example 

Figure A.1 illustrates the details of stage 1 of the Rapid Planning
Process  for  this  example.  We  assume  decisionmaking  is  based  on  the 
perceived  combat  power  ratio  (PCPR)  (see  stage  3).  The  command 
agent  deduces  the  PCPR  from  observations  (via  sensors)  of  two  quan¬ 
tities  in  the  local  area  of  interest,1  namely  enemy  combat  power  and 
friendly-force  combat  power.  These  two  data  input  streams  are  ana¬ 
lysed  independently  within  the  command  agent  via  a  pair  of  DLM 
class  II  mixture  models — one  model  tracks  the  enemy  combat  power 
values  while  the  other  model  independently  tracks  friendly-force 
combat  power  values.2  In  general,  each  class  II  mixture  model  com¬ 
prises  four  separate  DLMs:  a  ‘standard’  DLM,  an  outlier-generating 
DLM,  a  level  change  DLM,  and  a  slope  change  DLM. 


1  The  command  agent’s  local  area  of  interest  is  a  circular  region  centred  on  the  agent.  The 
radius  of  this  region  is  user  specified.  The  agent’s  recognised  picture  covers  only  this  region 
and  is  thus  ‘mobile’ — that  is,  it  moves  with  the  agent. 

2 See West and Harrison (1997), §12.3.

Figure A.1. Stage 1: Observation Analysis and Parameter Estimation

In each case, we look at a time series of observations of force levels (as assessed by a set of sensors or fed to the commander by an
information  source).  Each  DLM  represents  a  predisposition  by  the 
commander  to  look  at  the  series  of  estimates  in  a  particular  way,  tak¬ 
ing  account  of  other  contextual  knowledge  that  may  be  available  to 
him. 


•  The  ‘standard’  DLM  represents  the  assumption  by  the  com¬ 
mander  that  nothing  much  is  changing;  he  expects  that  the  time 
series  of  observations  will  carry  on  at  about  the  same  level. 

•  The  outlier  DLM  makes  the  assumption  that  the  current 
observation  is  a  significant  deviation  from  the  observations  seen 
so  far  (causing  a  much  higher  variance  in  the  series)  but  that  the 
series is expected to settle back to the previous level.

• The level change and slope change DLMs represent an assumption by the commander that there will be significant change in
level  or  slope  (rate  of  change)  of  the  series.  For  example,  if  the 
commander  has  access  to  his  superior  commander’s  plan,  he 
may  know  that,  at  a  certain  time,  additional  friendly-force  ele¬ 
ments  will  move  into  his  area  of  interest.  He  will  thus  be  predis¬ 
posed  to  look  out  for  this  when  tracking  the  value  of  his  own 
force  strength  over  time. 

Each  of  these  DLMs  is  equivalent  to  a  corresponding  hypothesis 
by  the  commander  about  what  is  happening  in  his  local  area  of  inter¬ 
est  while  tracking  the  critical  information  element  of  force  level  over 
time:  no  change;  a  blip,  which  can  be  ignored;  a  step  change;  or  a 
change  in  slope  (growth  or  decay).  When  we  have  a  vector  of  critical 
information  elements  making  up  the  commander’s  conceptual  space 
(also  called  the  common  relevant  operating  picture,  or  CROP),  these 
hypotheses  relate  to  the  likely  behaviour  of  the  values  of  the  critical 
information  elements  that  form  a  vector  characterising  the  conceptual 
space. 

The ‘standard’ DLM is a first-order polynomial DLM, representing a system model $M_1$ that describes a constant level time series. The parameters estimated by the DLM are the mean and variance of the time series level denoted, at time $t$, by $m(t)$ and $C(t)$, respectively.

The other three DLMs in the mixture model are all second-order polynomial DLMs. The outlier-generating DLM represents a system model, $M_2$, that describes a transient in the time series. The level change DLM represents a system model, $M_3$, that describes a step change in the time series. The slope change DLM represents a system model, $M_4$, that describes a slope change in the time series. The parameters estimated by each of these three DLMs are the mean values of the level and the growth rate of the time series denoted, at time $t$, by vector $\mathbf{m}(t)$, and the associated covariance of the level and growth rates denoted, at time $t$, by matrix $\mathbf{C}(t)$.
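
To make the role of the ‘standard’ model concrete, the following Python sketch shows one updating step of a first-order polynomial DLM using the usual Kalman-style recursion for a local level model. It is a simplified illustration and not the Rapid Planning Process implementation: the observation and evolution variances are assumed known and constant, and the function and argument names are hypothetical.

```python
def dlm_update(m_prev, c_prev, y, obs_var, evo_var):
    """One update of a first-order polynomial (constant level) DLM.

    m_prev, c_prev: posterior mean and variance of the level at time t-1.
    y: new observation at time t.
    obs_var, evo_var: assumed known observation and evolution variances.
    Returns (m, c, f, q): updated mean and variance, plus the one-step
    forecast mean and variance that were used to assess y.
    """
    r = c_prev + evo_var   # prior variance of the level at time t
    f = m_prev             # one-step forecast mean
    q = r + obs_var        # one-step forecast variance
    a = r / q              # adaptive coefficient
    e = y - f              # forecast error
    m = m_prev + a * e     # posterior mean m(t)
    c = a * obs_var        # posterior variance C(t)
    return m, c, f, q


if __name__ == "__main__":
    # Track a roughly constant force level from noisy observations.
    m, c = 100.0, 25.0
    for y in (102.0, 98.5, 101.0, 99.0):
        m, c, f, q = dlm_update(m, c, y, obs_var=9.0, evo_var=1.0)
        print(round(m, 2), round(c, 2))
```

The forecast mean and variance returned alongside the posterior are the quantities a mixture model would use to judge how plausible each new observation is under this model.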
The Common Relevant Operating Picture

The  CROP  (the  local  conceptual  space)  is  spanned  by  the  set  of  criti¬ 
cal  information  elements.  For  our  simple  example,  these  relate  to  the 
local  force  ratio.  In  each  case,  the  DLM  formulation  updates  the 
assessment  of  where  the  commander  perceives  he  is  located  within  the 
space  described  by  the  vector  of  information  elements.  This  corre¬ 
sponds  to  a  multivariate  normal  distribution.  The  commander’s  fixed 
patterns  correspond  to  particular  ‘areas’  within  this  space  that  he  con¬ 
siders  important,  such  as  good  own  force  level  and  poor  perceived 
enemy  force  level  locally.  To  each  of  these  fixed  patterns  is  associated 
a  particular  mission,  such  as  ‘advance’,  representing  the  direct  link 
between  situation  assessment  and  choice  of  feasible  CoA  required  by 
the  RPD  approach.  The  overlap  between  the  output  from  the  DLM 
and  the  fixed  patterns  is  used  to  update  the  probability  that  each  of 
these  patterns  is  the  most  relevant.3 

In  more  detail,  and  taking  as  an  example  enemy  and  own  force 
strengths  as  the  factors  forming  the  recognised  picture,  each  DLM 
mixture  model  operates  on  an  input  time  series,  i.e.,  a  sequence  of 
observations  received  from  external  sensors.4  For  one  mixture  model, 
the  input  time  series  comprises  observations  of  the  enemy  combat 
power in the command agent’s local area of interest; this series is denoted by $Y_e(t)$ in Figure A.1 and comprises the sequence


For  the  other  mixture  model,  the  input  time  series  comprises  observa¬ 
tions  of  the  friendly-force  combat  power  in  the  command  agent’s 
local area of interest; this series is denoted by $Y_o(t)$ and comprises the
sequence 


3  An  example  of  how  this  can  be  implemented  is  shown  in  Chapter  2  of  Moffat  (2002;  also 
see  p.  38). 

4  In  this  example,  we  focus  on  only  a  single  critical  information  element:  combat  power. 




Note  that  the  observations  in  the  two  time  series  need  not  necessarily 
coincide  because  they  are  independent  input  streams. 

Each  DLM  mixture  model  processes  its  associated  time  series  of 
observations  in  the  same  way  (and  independently  from  the  other 
DLM  mixture  models).  We  describe  this  process  below  for  the  enemy 
combat  power  time  series;  an  analogous  process  operates  in  parallel 
for the friendly-force combat power time series. Figure A.1 shows the state of the parameter estimation process after the observations up to, and including, $Y_e(t-1)$ have been processed by the DLM mixture model and before the next observation, $Y_e(t)$, is processed. To process the next enemy combat power observation, $Y_e(t)$ is fed into the DLM mixture model and analysed. The DLM algorithms follow the Bayesian methods developed in West and Harrison (1997). At each stage
of  the  process,  a  probability  is  computed  for  each  of  the  commander’s 
hypotheses  (corresponding  to  one  of  the  DLMs).  These  probabilities 
are  tracked  over  time  to  assess  whether  we  are  approaching  the 
boundary  of  the  ‘OK’  state,  i.e.,  the  probability  of  no  change  has 
declined  significantly.  The  following  are  key  outputs  of  the  mixture 
model: 

• Updated estimates of the system model parameters. These estimates now take into account the new observation $Y_e(t)$. There are four sets of these estimates, denoted $(m_e(t), C_e(t))_k$, where $k \in [1,4]$ is the DLM type. One set of estimates is produced by each DLM in the mixture model. The particular values $(m_e(t), C_e(t))_1$ are the current estimates of the mean and covariances of the enemy combat power (level and growth) on the assumption that system model $M_1$ represents the time series seen to date.

• Likelihood estimates for each system model. This is the likelihood that the observation $Y_e(t)$ would have been obtained from each system model. There are four of these, one for each DLM in the mixture model. The likelihoods are denoted $L(Y_e(t) \mid M_k, D_e(t-1))$, where $D_e(t-1)$ represents all observations seen up to, but not including, the current observation, $Y_e(t)$. This is repeated for the friendly-force estimates.

• Posterior probabilities that each hypothesis is correct. The posterior probabilities $p(M_k \mid D_e(t))$ are the probability that model $M_k$ best describes the time series of observations seen up to time $t$. This is repeated for the friendly-force estimates.

The posterior probabilities $p(M_k \mid D_e(t))$ (for the enemy combat power observations) and $p(M_k \mid D_o(t))$ (for the friendly-force combat power observations) are updated on a continuous basis as part of the command agent’s sensing cycle.
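
The relationship between the likelihoods and the posterior model probabilities can be sketched with a plain application of Bayes’ rule. The Python below is a simplified illustration only: it assumes each model supplies a normal one-step forecast (mean and variance) for the new observation, and it omits the additional steps of the full class II mixture procedure described by West and Harrison (1997). All function names and numbers are hypothetical.

```python
import math


def normal_pdf(y, mean, var):
    """Density of a normal distribution, used here as the model likelihood."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def update_model_probs(prior_probs, likelihoods):
    """Posterior probability of each model: prior times likelihood, normalised."""
    weighted = [p * l for p, l in zip(prior_probs, likelihoods)]
    total = sum(weighted)
    if total <= 0.0:
        raise ValueError("all models have zero weight")
    return [w / total for w in weighted]


if __name__ == "__main__":
    # Two illustrative models with the same forecast mean: a 'standard'
    # model with a tight forecast variance and an 'outlier' model with an
    # inflated one. A surprising observation shifts weight to the outlier.
    forecasts = [(100.0, 4.0), (100.0, 400.0)]
    priors = [0.9, 0.1]
    y = 115.0
    likes = [normal_pdf(y, f, q) for f, q in forecasts]
    print([round(p, 4) for p in update_model_probs(priors, likes)])
```

Repeating this update at each sensing cycle, with the previous posterior as the next prior, gives a running probability for each of the commander’s hypotheses.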


Stage  2:  Situation  Assessment 

The means, covariances, likelihood estimates, and posterior
probabilities are input to stage 2 in the Rapid Planning Process.
Figure A.2 illustrates the processes in this stage. At each command
and control cycle (which runs independently of the sensing cycle), the
command agent performs an SA to decide whether the perceived
situation, based on the sensor observations made to date, is currently
‘OK’ or ‘Not OK’. The situation assessment is performed in two steps.

Figure A.2

Step  1 — OK/Not  OK  Assessment 

The  first  step  of  the  SA  considers  the  enemy  combat  power  and 
friendly-force  combat  power  observations  separately,  as  follows. 
Examining  each  DLM  mixture  model: 

•  If  the  ‘standard’  DLM  has  the  highest  posterior  probability,  the 
situation  is  deemed  OK.  This  conclusion  is  based  on  the  fact 
that  the  combat  power  observed  is  currently  showing  a  steady 
level.5

•  If  any  of  the  other  three  DLMs  (the  outlier,  the  level  change,  or 
the  growth  change  models)  has  the  highest  posterior  probability, 
the  situation  is  deemed  Not  OK.  This  conclusion  is  based  on 
the  fact  that  the  combat  power  observed  has  changed  from  a 
steady  level. 
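The rule in Step 1 can be sketched as follows for a single mixture model. This is a minimal sketch: the function name and the ordering of the posterior list are assumptions, and ties are resolved in favour of the first model listed (here the standard DLM), which the text does not specify.

```python
DLM_TYPES = ("standard", "outlier", "level change", "growth change")


def assess_ok(posteriors):
    """Return True (OK) if the 'standard' DLM has the highest posterior."""
    if len(posteriors) != len(DLM_TYPES):
        raise ValueError("need one posterior per DLM type")
    # max() returns the first index on ties, so the standard model wins ties.
    best = max(range(len(DLM_TYPES)), key=lambda k: posteriors[k])
    return DLM_TYPES[best] == "standard"
```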

Step  2 — Initial  Situation  Assessment 

Step 1 generates an OK/Not OK result from each DLM mixture
model. In the second step, we combine these results, using Table A.1,
to determine an overall assessment of the current situation. This
corresponds to the ‘storytelling’ level of SA discussed by Klein (1989).


5 In West and Harrison’s version of the DLM class II mixture model (West and Harrison,
1997,  §12.3),  the  ‘standard’  model  is  the  linear  growth  model  (the  second-order  polynomial 
DLM).  It  should  now  be  clear  why,  in  our  case,  we  actually  need  the  standard  model  to  be 
the  constant  model  (the  first-order  polynomial  DLM),  representing  a  system  model  that 
describes  a  constant  level  time  series.  It  is  because  a  linear  growth  model  used  as  the  standard 
model  (the  OK  model)  might  remain  the  most  likely  model  throughout — so  that  we  would 
interpret  the  situation  as  remaining  OK — while  actually  tracking  a  steady  drift  of  combat 
power  values  across  a  wide  range — so  that  the  situation  therefore  might  not  always  be  OK 
from  a  PCPR  perspective.  The  only  OK  situation  is  the  one  in  which  the  combat  power 
observations  are  remaining  more  or  less  constant — hence  the  use  of  the  constant  (first-order) 
DLM.

Table A.1

Initial Situation Assessment Matrix

                          Enemy Combat Power Mixture Model
Friendly-Force Combat
Power Mixture Model       OK            Not OK

OK                        OK            Not OK
Not OK                    Not OK        Not OK

Thus, the overall SA is OK only if the situation is OK with
respect to both the enemy and friendly-force combat power
observations. In each of the other cases, one or another, and possibly
both, of the SAs are Not OK because there has been a significant
change in the enemy and/or friendly-force combat power, so the overall
SA is deemed Not OK.
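The combination rule in Table A.1 amounts to a logical AND of the two Step 1 results. A minimal sketch, with a hypothetical function name:

```python
def overall_assessment(enemy_ok, friendly_ok):
    """Combine the two Step 1 results as in Table A.1."""
    return "OK" if (enemy_ok and friendly_ok) else "Not OK"


# Enumerate the four cells of the matrix, keyed by (enemy, friendly).
matrix = {
    (enemy, friendly): overall_assessment(enemy, friendly)
    for enemy in (True, False)
    for friendly in (True, False)
}
```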

The  idea  behind  the  SA  described  here  is  to  provide  an  initial 
OK/Not  OK  alert  to  the  command  agent.  If  the  situation  is  OK,  the 
command  agent  carries  on  doing  whatever  it  is  currently  doing — it 
remains  in  its  current  posture;  there  is  no  need  to  do  any  (stage  3) 
pattern  matching  and  preferred  posture  selection,  because  everything 
is  currently  OK. 

If,  however,  the  situation  is  Not  OK,  then  only  in  this  case  does 
the  command  agent  need  to  go  into  stage  3  of  the  Rapid  Planning 
Process  and  do  some  pattern  matching  to  find  out  if  a  change  in 
posture  is  required. 

If the situation is Not OK, the command agent invokes stage 3
of the Rapid Planning Process model. Some key data items6 are
passed to stage 3, namely $m_e(t)$ and $m_o(t)$, the current best
estimates of the enemy and own force combat power values,
respectively, and their associated variances, $C_e(t)$ and $C_o(t)$.
These ‘best’ estimates are the values output by the DLM, in each
mixture model, that currently has the highest posterior probability.

6 In this version of the Rapid Planning Process model, only the means and variances of the
combat power values are used in stage 3. We do not forward to stage 3 any of the additional
information that is actually available at the end of stage 2, namely the growth rate and its
variance (in the case of second-order polynomial DLMs) and knowledge of which system
models are the better descriptors of each combat power time series. Future enhancements to
the model will likely make use of this additional information.


Stage  3:  Pattern  Matching  and  Course  of  Action  Selection 

Stage  3  of  the  Rapid  Planning  Process  model  attempts  to  recognise 
the  extant  battlespace  situation,  based  on  the  data  received  by  the 
command  agent,  and  identify  the  posture  (CoA)  appropriate  to  this 
situation.  Figure  A.3  illustrates  the  process. 

As mentioned earlier, the inputs to stage 3 are the current best
estimates of the enemy and own force combat power values,
respectively, and their associated variances. From these, we calculate
the PCPR at the current time $t$, denoted $\mathrm{PCPR}(t)$. The
$\mathrm{PCPR}(t)$ is a random variable with probability density having
a time-dependent mean, $\mu_2(t)$, and a standard deviation,
$\sigma_2(t)$. We depict these elements in Figure A.3. The
$\mathrm{PCPR}(t)$ distribution, characterised by its mean and
standard deviation, is input to the main pattern-matching process.

The pattern-matching process (denoted by the symbol ⊗ in Figure
A.3) compares the $\mathrm{PCPR}(t)$ distribution against a number of
patterns, denoted $P(k)$. Each pattern is a representation of one
possible situation that could exist in the battlespace. The comparison
aims to select the most likely pattern given the $\mathrm{PCPR}(t)$ being
compared. The comparison (pattern match) of $\mathrm{PCPR}(t)$ against a
given pattern, $P(k)$, yields two outputs:

• $L(\mathrm{PCPR}(t) \mid P(k))$: The likelihood that $\mathrm{PCPR}(t)$
would have been obtained had the situation in the battlespace been
the one represented by pattern $P(k)$.

• $p(P(k) \mid D(t))$: The posterior probability that pattern $P(k)$ is
the one that best represents the situation in the battlespace, given
the time series of (enemy and own force combat power) observations
seen up to time $t$ (i.e., the current situation).

Having calculated the posterior probability of each pattern
$P(1), P(2), \ldots, P(n)$, we select the pattern $P(k)$ with the highest
posterior probability as the one that best represents the situation
extant in the battlespace. The situation has now been ‘recognised’.

The next step, and the essence of the RPD model of decisionmaking,
is to invoke the decisionmaker’s experience and map the
recognised situation to an appropriate CoA. In Figure A.3, experience
is represented by the set of one-to-one mappings between patterns
$P(i)$ and $\mathrm{CoA}(i)$ for all $i \in [1,n]$ stored in the command
agent’s long-term memory. Thus, the selected pattern $P(k)$,
representing the recognised situation, leads directly to the selection
of an appropriate CoA, namely $\mathrm{CoA}(k)$.
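The recognition and mapping steps can be sketched as below. The text does not give the form of the patterns or of the likelihood, so this sketch assumes, purely for illustration, that each pattern is a Gaussian on the PCPR scale with a hypothetical centre and spread, that the likelihood is the Gaussian density of the PCPR mean with the PCPR variance added to the pattern variance, and that the prior over patterns is supplied by the caller. The pattern values and their pairing with the five courses of action are invented for the example.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def select_coa(pcpr_mean, pcpr_sd, patterns, priors, coas):
    """Return (k, posteriors, CoA(k)) for the best-matching pattern P(k)."""
    # Assumed likelihood: Gaussian with pattern and PCPR variances combined.
    likelihoods = [
        gaussian_pdf(pcpr_mean, centre, math.hypot(spread, pcpr_sd))
        for centre, spread in patterns
    ]
    weighted = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(weighted)
    posteriors = [w / total for w in weighted]
    k = max(range(len(patterns)), key=lambda i: posteriors[i])
    # Experience: a one-to-one mapping from pattern index to CoA.
    return k, posteriors, coas[k]


# Hypothetical (centre, spread) patterns and a hypothetical mapping to CoAs.
patterns = [(0.25, 0.2), (0.6, 0.2), (1.0, 0.3), (1.8, 0.5), (3.0, 1.0)]
coas = ["advance", "attack", "defend", "delay", "withdraw"]
priors = [0.2] * 5

k, posteriors, preferred = select_coa(0.3, 0.1, patterns, priors, coas)
```

In a running model the prior for each pattern would presumably be carried forward from the previous cycle rather than held uniform; the uniform prior here is only a convenience.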

$\mathrm{CoA}(k)$, selected in this way, is referred to as the preferred
posture. It is the posture that the command agent’s experience says is
most appropriate, given the situation recognised in the agent’s local
area of interest. The preferred posture is then passed into stage 4
(posture transition) of the Rapid Planning Process model. We do not
make use of stage 4 of the Rapid Planning Process in the method
proposed in this report. Our processing terminates with the selection
of $\mathrm{CoA}(k)$.

Moffat (2002) details the mathematical development of these
algorithms for the general case of a conceptual space with several
factors.

Application 

The following is an example application. The modelling test bed used
is CLARION+, an experimental test bed developed by the Defence
Science and Technology Laboratory (Dstl) to examine the effect of
such decisionmaking on combat behaviour. Figure A.4 is a screen
image from CLARION+ that shows a campaign-level land-air
interaction between two forces (Red and Blue) in which Red, using a bold
command strategy developed by a genetic algorithm, fixes Blue in the
south and then flanks north to exploit a hole in Blue’s defence. The
boxes with a single diagonal line marking are airborne sensors that
help to generate the operational picture and assessment of enemy
intent on which the plan is based.

Figure A.4

CLARION+ Screen Image of Land-Air Interaction

For  one  of  the  brigades  in  the  circle,  the  dynamics  of  the  Rapid 
Planning  DLMs  used  to  assess  the  level  of  enemy  force  strength  in  the 
local area of interest of the brigade are shown in Figure A.5.

At  the  top  left-hand  part  of  the  figure,  the  mean  values  of  enemy 
force  strength  assessed  in  the  local  area  are  shown  for  each  of  the  four 
mixture  models  (standard,  outlier,  level  change,  slope  change).7  These 
values  vary  with  time  along  the  x-axis  and  grow  as  the  brigade 
encounters  an  enemy  group  in  its  local  picture. 


Figure A.5

Rapid Planning Type II Mixture Model Depiction


7 In Figure A.5, these models are called constant, outlier, level, and growth, respectively.

The bottom left-hand corner of the figure displays the time-varying
Bayesian probabilities that each of the four models is a correct
assessment of the situation. These probabilities also vary with time
along the x-axis. The most likely hypothesis moves from the standard
model, through the outlier model, to a realisation that there is a level
change in enemy combat power occurring in the local area. The
constant model later supersedes this again.

At  the  top  right  is  a  display  of  the  probability  that  each  of  the 
fixed  patterns  (and  hence  the  associated  CoA)  is  the  best  pattern 
match  for  the  current  perceived  situation,  for  that  brigade,  at  the  time 
the  simulation  test  bed  was  paused.  The  possible  courses  of  action  are 
advance,  attack,  defend,  delay,  or  withdraw.  The  figure  shows  that  at 
the  time  the  simulation  was  paused,  the  local  commander  favoured 
advancing  or  attacking. 


APPENDIX  B 


Information  Entropy 


Claude  Shannon  first  introduced  information  entropy  in  1948.  In  the 
early  1940s,  it  was  generally  believed  that  noise  limited  the  flow  of 
information through a channel. That is, if one decreased the
probability of error in the received message, the true rate of data
transmission decreased. Consequently, an error-free message could only occur
if  transmission  ceased!  Shannon  disproved  this  theory.  He  was  able  to 
show  that,  in  fact,  if  a  channel  had  nonzero  capacity  (calculated  from 
the  noise  of  the  channel),  an  arbitrarily  low  probability  of  error  could 
be  achieved  as  long  as  the  transmission  rate  was  below  the  channel 
capacity.  He  also  argued  that  random  processes  such  as  speech  and 
music  had  an  irreducible  complexity  below  which  signal  compression 
was  impossible.  He  referred  to  this  as  entropy  and  further  claimed  that 
if  the  entropy  at  the  source  of  a  communication  channel  was  less  than 
its  capacity,  an  arbitrarily  low  error  rate  could  be  achieved. 1 

It is this reference to communications as a stochastic or random
process that leads to its application in the field of statistics. In his
book on information theory, Solomon Kullback (1978, p. 1) cites
several sources to support his argument that the statistical theory of
communications is synonymous with communications theory and
that communications theory and information theory are also
synonymous. Because probability distributions describe the uncertainty
associated with mutually exclusive and collectively exhaustive events,
it is natural to ask about uncertainty’s complement, that is, what is
known, or the amount of information available. This leads us to the
modern use of information theory as a measure of the average
information available in a probability distribution.

1 This summary draws on Cover and Thomas (1991), Blahut (1987), and Kullback (1978).


A  Statistical  Theory  of  Information 

Suppose $X = \{x_1, x_2, \ldots, x_n\}$ is a discrete random variable with
probability mass function $P(X = x_i) = p_i$. Each of the $x_i$ represents an
event (as do conjunctions and disjunctions of the $x_i$), the occurrence
of which imparts information. What we seek is a measure for the
amount of information imparted. It seems logical to assume that this
amount, whatever it is, is inversely proportional to the likelihood that
the event will occur, or

$$I(x_i) \propto \frac{1}{p_i}.$$


For  example,  the  fact  that  the  sun  rose  this  morning  imparts  no 
information,  because  we  knew  it  all  along.  That  is,  the  likelihood  of 
its  occurrence  is  1.0.  Conversely,  being  told  that  you  have  just  won 
the  lottery  conveys  considerable  information  because  it  is  an  unlikely 
event. 

In  a  1928  paper,  Ralph  Hartley  was  the  first  to  suggest  the  use 
of  the  logarithm  in  a  measure  of  information  by  defining  a  measure 
of  information  to  be  the  logarithm  of  the  number  of  possible  symbol 
sequences  (Hartley,  1928).  Shannon  picked  up  the  idea  of  using  the 
logarithm  as  the  proportionality  constant  and  suggested  that  the 
amount  of  information  in  the  occurrence  of  an  event  is 


$$I(x_i) = \log\left(\frac{1}{p_i}\right).$$


This was a particularly good choice because it is closely related
to the concept of data compression, as we shall see next. Shannon was
concerned with the output of a discrete information source where
each of the $x_i$ represents a source output that occurs with probability
$p_i$. For this reason, the base 2 logarithm was used and information
was measured in terms of bits.2 However, in this work and elsewhere,
we use the base $e$ and measure information in terms of ‘natural units’,
or nats.
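To make the units concrete, the following sketch evaluates the information in a single event, the logarithm of one over its probability, in either nats or bits. The function name and the `base` parameter are illustrative choices, not part of the original text.

```python
import math


def self_information(p, base=math.e):
    """Information in an event of probability p: log(1/p) in the given base."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return math.log(1.0 / p) / math.log(base)


certain_event = self_information(1.0)        # the sun rose this morning
fair_coin_bits = self_information(0.5, 2)    # two equally probable alternatives
fair_coin_nats = self_information(0.5)       # the same event measured in nats
```

A certain event carries no information, and one of two equally probable alternatives carries one bit, or log 2 nats.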

The next step is to assess the average or expected information in
the probability distribution. This quantity is referred to as
information entropy or Shannon entropy and is calculated as

$$E[-\log(P(X))] = H(X) = -\sum_{i=1}^{n} p_i \log p_i.$$


The quantity $H(X)$ represents the mean information content in
$P(X)$ or the amount of uncertainty in $P(X)$. The latter interpretation
implies that information entropy is a function of the variance of a
distribution. This is the case and is readily evident in continuous
distributions. If the base 2 logarithm is used, $H(X)$ is also the average
number of bits required to describe the random variable, $X$.

It is interesting to note that for discrete random variables,
entropy is indeed bounded. A lower bound (maximum certainty)
occurs when $p_i = 1$ and $p_j = 0$ for all $j \neq i$. In this case
(see footnote 3),

$$H(X) = -1 \log 1 - (n-1)\, 0 \log 0 = 0.$$

Therefore, the average information is 0 nats when there is no
uncertainty. This is consistent with the earlier definition of information.
At  the  other  end  of  the  spectrum,  complete  uncertainty  exists  when 
all  events  are  equally  probable.  The  entropy  calculation  in  this  case  is 


$$H(X) = -\sum_{i=1}^{n} \frac{1}{n} \log \frac{1}{n} = \sum_{i=1}^{n} \frac{1}{n} \log n = \log n.$$

2 It turns out that one bit of information is the minimum information required to resolve
the uncertainty in a situation with two equally probable alternatives.

3 It can be shown that $\lim_{x \to 0} x \log x = 0$.

Consequently, for discrete random variables, the average
information of the probability mass function is bounded, or
$H(X) \in [0, \log n]$. We show later that this is never true for continuous
random variables; that is, the entropy for continuous random
variables is unbounded.
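The two bounds can be checked numerically with a short sketch. The function name is illustrative, and terms with zero probability are skipped on the strength of the limit that x log x tends to 0 as x tends to 0.

```python
import math


def shannon_entropy(probs):
    """Entropy in nats of a probability mass function given as a list."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Zero-probability terms contribute nothing (limit of x log x at 0).
    return sum(p * math.log(1.0 / p) for p in probs if p > 0.0)


n = 4
lower = shannon_entropy([1.0, 0.0, 0.0, 0.0])   # maximum certainty
upper = shannon_entropy([1.0 / n] * n)          # all events equally probable
between = shannon_entropy([0.7, 0.1, 0.1, 0.1])
```

Any other distribution on four events, such as the third example, gives a value strictly between 0 and log 4.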


Differential  Entropy 

The foregoing discussion assumed that the random variable was
discrete, and we were able to show that the entropy of its probability
mass function was bounded. In information theory, this is
equivalent to stating that the information source is discrete and that it
generates discrete information at a finite rate. In contrast, the entropy
of the density function for a continuous random variable is
unbounded. In information theory, this is equivalent to a continuous
information source that can assume any one of an uncountably
infinite number of amplitude values, thus requiring an infinite number
of bits for its complete specification. Because this is never
possible, its entropy is unbounded.

Suppose now that $X$ is a continuous random variable with
probability density function $f(x)$. The differential entropy of $X$ in nats is
defined to be

$$H(X) = -\int_{-\infty}^{\infty} f(x) \log f(x)\,dx.$$

Unlike the entropy of a discrete random variable, the entropy of
a continuous random variable is unbounded. We can illustrate this
fact by approximating the continuous probability density, $f(x)$, with
a probability mass function, $p(x)$, that is constant on intervals of
width $\Delta x$. The approximating probability density function has
probability $p_j$ on the $j$th interval. To ensure that

$$\sum_j p_j = 1,$$

we set $p_j = p(x_j)\,\Delta x$, where $x_j$ is a point in the $j$th interval such that
$p(x_j)\,\Delta x$ is the area under $p(x)$ in the $j$th interval. The entropy of the
approximating probability distribution is

$$
\begin{aligned}
H(p) &= -\sum_j p_j \log p_j \\
&= -\sum_j p(x_j)\,\Delta x \log[p(x_j)\,\Delta x] \\
&= -\sum_j p(x_j)\,\Delta x \log[p(x_j)] - \sum_j p(x_j)\,\Delta x \log[\Delta x] \\
&= -\sum_j p(x_j)\,\Delta x \log[p(x_j)] - \log[\Delta x].
\end{aligned}
$$

Now, if we let $\Delta x \to 0$, the summation converges to an integral,
but $\log[\Delta x] \to -\infty$. Because there is no way to avoid this divergence,
the entropy of a continuous random variable is always unbounded.
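The divergence can be seen numerically. The sketch below is an illustration under stated assumptions: it uses a standard normal density (not a distribution discussed in the text), evaluates it at interval midpoints over a truncated range, and computes the discrete entropy for shrinking interval widths. Adding log of the interval width back recovers the closed-form differential entropy of the standard normal, while the discrete entropy itself keeps growing.

```python
import math


def normal_pdf(x):
    """Standard normal density, chosen only as a convenient example."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def discretised_entropy(dx, lo=-10.0, hi=10.0):
    """Entropy (nats) of the pmf p_j = f(x_j) * dx built from midpoints."""
    n = int(round((hi - lo) / dx))
    masses = [normal_pdf(lo + (j + 0.5) * dx) * dx for j in range(n)]
    total = sum(masses)  # renormalise away the tiny truncation error
    return -sum((m / total) * math.log(m / total) for m in masses if m > 0.0)


# Closed-form differential entropy of N(0, 1): 0.5 * log(2 * pi * e).
differential = 0.5 * math.log(2.0 * math.pi * math.e)

for dx in (0.1, 0.01, 0.001):
    h = discretised_entropy(dx)
    # h grows like -log(dx); h + log(dx) stays near the differential entropy.
    print(dx, h, h + math.log(dx))
```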

Differential entropy can also be negative. For example, consider
a random variable, $X$, distributed uniformly from 0 to $a$. Its density
function is

$$
f(x) = \begin{cases} 1/a & \text{if } 0 \le x \le a \\ 0 & \text{otherwise.} \end{cases}
$$

The differential entropy therefore is

$$H(X) = -\int_0^a \frac{1}{a} \log \frac{1}{a}\,dx = \log a.$$

Note that for $a < 1$, $H(X) = \log a < 0$. $H(X)$ is also unbounded
below as $a \to 0$.

Suppose $X$ is a continuous random variable distributed exponentially
with mean $1/\lambda$. The density function for $X$ therefore is

$$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0.$$

The differential entropy is

$$
\begin{aligned}
H(X) &= -\int_0^\infty \lambda e^{-\lambda x} \log\left(\lambda e^{-\lambda x}\right) dx \\
&= 1 - \log(\lambda) \\
&= \log\left(\frac{e}{\lambda}\right).
\end{aligned}
$$

The differential entropies for several probability distributions
have been tabulated by Thomas Cover and Joy Thomas and can be
found in their book, Elements of Information Theory (1991).
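The two closed forms derived above, log a for the uniform distribution and log(e/λ) for the exponential distribution, can be checked with a simple numerical integration. This is a sketch only: the midpoint-rule integrator, the parameter values, and the truncation of the exponential's range at 40 are all assumptions made for the example.

```python
import math


def differential_entropy(pdf, lo, hi, steps=100_000):
    """Midpoint-rule estimate of -integral f log f over [lo, hi], in nats."""
    dx = (hi - lo) / steps
    total = 0.0
    for j in range(steps):
        fx = pdf(lo + (j + 0.5) * dx)
        if fx > 0.0:
            total -= fx * math.log(fx) * dx
    return total


a = 0.5     # uniform on [0, a]; a is below 1, so the entropy is negative
uniform = differential_entropy(lambda x: 1.0 / a, 0.0, a)

lam = 2.0   # exponential with mean 1 / lam, range truncated at 40
exponential = differential_entropy(
    lambda x: lam * math.exp(-lam * x), 0.0, 40.0
)
```

Here `uniform` should agree with log 0.5 and `exponential` with log(e/2), up to the small error of the numerical integration.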


APPENDIX  C 


Application  to  a  Logistics  Network1 


This appendix records the application of both the plecticity and
collaboration metrics, with some extensions, to the logistics example
discussed in the main text. As mentioned earlier, it is important to assess
the effects of improved decisionmaking on combat outcomes. The
measures and metrics we have developed are designed to assess the
degree to which sharing information among headquarters in a
clustered network contributes to improved decisionmaking. The ultimate
measure of this effect is how well the friendly forces achieved their
mission, i.e., combat effectiveness. Consequently, Dstl has developed
a spreadsheet version of the information-sharing model, the
Collaboration Metric Model (CMM), which is used to calculate both the
plecticity and collaboration metrics described in the text for specific
clusterings of decisionmaking nodes across an information network.

Figure C.1 summarises a methodology for assessing alternative
command and control processes, using a combination of combat
simulation modelling and the CMM. Information flows recorded in
the simulation model are used as inputs for the CMM. The CMM
results may then be used to select preferred network structures as
inputs to the simulation model, as depicted in Figure C.1 by the dashed
line. It is then possible to relate Measures of Command and Control
Effectiveness of the network clustering to Measures of Force
Effectiveness, thus illustrating the relationships between information
sharing and combat effectiveness.

Figure C.1

Assessing the Effects of Information Sharing on Combat Effectiveness

1 The analysis presented in this appendix is primarily the work of RAND colleague Chris
Pernin while on secondment to Dstl.

The  CMM  can  handle  up  to  10  decision  nodes,  10  information 
elements,  and  10  information  sources  (see  Figure  2.2  in  Chapter  Two 
for  illustrations  of  these  different  network  elements).  This  capacity 
allows  a  reasonable  representation  of  a  rather  robust  headquarters. 
The  metrics  discussed  in  the  text  form  the  basis  of  the  Overall 
Network  Performance  metric  calculated  by  the  model  and  include 
both  the  static  systemic  measures  of  plecticity  and  the  dynamic 
measures  of  collaboration.  These  are  combined  to  arrive  at  a  single 
metric  to  assess  the  effects  of  collaboration  and  plecticity  across  a 
cluster  of  information-sharing  entities. 


Cases  Examined 

Three logistics command and control structures were assessed using
the CMM. The decision made in all cases is the logistics allocation
decision described in Chapter Two, except that in this application,
the resupply of ammunition, not fuel, was the focus. The first case is
a supply-driven network similar to the ‘push’ sustainment model
depicted in Figure 2.3 in Chapter Two. In this case, denoted S, the
Forward Support Group (FSG); the Air Assault Brigade, Brigade
Supply Area (AA Bde BSA); the Armoured Division, Divisional Supply
Area (Armd Div DSA); and the Corps Artillery, Brigade Supply
Area (Corps Arty BSA) all form decision nodes, as shown by the
rectangles in Figure C.2. However, there is no information sharing to
form a common perception; thus, each of these decision nodes is a
degenerate ‘cluster’ consisting of one node, shown by the dashed
ellipses. Information on logistics demand is sent to these second and
third line units from the Attack Helicopter Regiment Forward
Operating Base (AH Regt FOB); the Armoured Brigade, Brigade Supply
Area (Armd Bde BSA); the Mechanised Brigade, Brigade Supply Area
(Mech Bde BSA); and the Multiple-Launch Rocket System Regiment
Ammunition Control Point (MLRS Regt ACP). These information
sources are shown as circles in Figure C.2. The amount supplied is
based on a set expectation of use.


Figure  C.2 

A Supply-Driven Information Network: Case S

The next two cases are demand driven and denoted as D1 and
D2. Demand driven means that the units anticipate their supply
requirements and decide how much resupply to request, or ‘pull’,
from their arbiters at the next command echelon. How well they do
this depends on their ability to share information, as we will see.

In the first demand case, D1, depicted in Figure C.3, each first
and second line unit (10 units in total) sends its demand for an asset,
which is met by the resource manager. The managers deal with each
demand separately (i.e., they do not cross-correlate demands from
different subordinate units). In this case, there are 10 decision nodes,
each of which forms an isolated cluster of size 1.


Figure  C.3 

A Demand-Driven Information Network with No
Information Sharing: Case D1

The second demand-driven network, case D2, is depicted in
Figure C.4. In this final case, each of the three second line logistics
units is clustered with its subordinates into a full information sharing
and shared awareness cluster. The superior units use their knowledge
of all their subordinates’ information elements to update their
perception of the current status and needs of each unit.

The first two cases, S and D1, are extremes in logistic
decisionmaking. The first case uses doctrine to push materiel to the units,
regardless of unfolding events. The amount being pushed to the units
is decided a priori and is not updated over time. The second case uses
a daily update of what was consumed to resupply stocks to previous
levels. The third case (D2) is a variant on the second case but
contains additional clustering of information. This case uses three
clusters that contain the 10 decision nodes.

Figure C.4

A Demand-Driven Information Network with Information Sharing:
Case D2


Discussion  and  Results 

Figures C.5 and C.6 show two metrics calculated by the CMM.
Figure C.5 is the Overall Network Performance (combining
collaboration and plecticity) for each of the three options. These values can
range from 0 (very poor performance) to 1 (excellent performance).
The shaded region defines the minimum and maximum of the value
over the 24-hour scenario; the black bar shows the average over time.
From Figure C.5, we can see that the most significant difference
arises from the clustering of decision nodes. Cases D1 and D2 have
the same information elements and number of decision nodes. They
differ crucially in the number of clusters sharing information. In the
former case, each logistics unit is introduced to one information
element and develops an understanding of the logistics consumption
based on that information. In the latter, the decision nodes are able to
access information from neighbouring units that helps build a better
understanding of the situation. Even though both demand cases seem
to have a much better understanding of the information elements
over time compared with the supply-driven case, it is only when the
information is shared among decision nodes that the increase in
Overall Network Performance becomes evident. In this example, the
sharing of information provides a greater increase in the overall ability
of the network to perform than does the location of the
decisionmaking.

Figure C.6 records the knowledge derived from collaboration
only, that is, the dynamic elements of the information network. The
collaboration-based knowledge metric measures the knowledge
gained from the dynamics of the information network, as discussed in
Chapter Four.

Figure C.5

Overall Network Knowledge

Figure C.6

Collaboration-Based Knowledge

There are two main differences among the three cases. The first
is the variation within each data set. A comparison of the three cases
reveals that the upper case (case D2) has much less variation between
adjacent points than the lower two cases (cases D1 and S). The
enhanced clustering in case D2 compared with D1 has perhaps
relieved the uncertainty of unexpected changes in the information
elements. A reduced sensitivity to changes in the information
elements is reflected in a less volatile and smoother line. The knowledge
of three units engaged in a sudden change in their supply level will be
more understandable or palatable to a commander than if only one
unit experiences that change.

The second difference among the data is the level of
collaboration-based knowledge. Case S exhibits the lowest knowledge
level, reflecting the large differences between the average doctrinal use
and the actual use during combat. The two demand cases
provide enhanced knowledge compared with the supply case because
the baseline is much more closely related to the actual use. The
difference between the two demand cases indicates the value of shared
information between peers. The information elements and baselines
are the same in both demand cases. However, the system values
calculated through the dynamic linear models (see Appendix A) are
much closer, and hence have enhanced knowledge, in the case of the
more collaborative network. In this example, the three-cluster
demand-driven network (case D2) provides the clearest picture of the
consumption of the subordinate units.